Fugu-MT: Translations of arXiv Papers

This site translates the metadata of papers posted on arXiv. (arXiv metadata is released under CC0.) The translated text is licensed under CC BY-SA 4.0. Translations are produced with Fugu-Machine Translator.

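The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any outcome arising from the use of this site (including all information and translations). Technical details are presented on the developer's blog.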

The table below shows at most 200 entries.

Title	Authors	Abstract	Publication date / Translation date
# 脳モデルとしての概念価値ネットワーク A Concept-Value Network as a Brain Model ( http://arxiv.org/abs/1904.04579v6 ) ライセンス: Link先を確認	Kieran Greer,	(参考訳) 本稿では,脳様モデルの物理的実体と概念的実体の関係を記述するための統計的枠組みを提案する。特徴と概念のインスタンスはコンテキストに置かれ、化学接続も可能であるが、この論文は特徴が電気配線である可能性を示唆している。この考え方では、実際の接続長は、発射速度とニューロン同期と関係があるため重要であるが、信号タイプはそれほど重要ではない。この論文は、概念が特徴集合と概念インスタンスをリンクするニューロン群であり、それらのグループからの化学信号によって決定されることを示唆している。したがって、特徴はニューラルネットワークの静的水平フレームワークとなり、概念はこれらを垂直に相互に結合する。機能に関して、ニューロンは機能的と考えられ、より水平な記憶構造はグリアとなる。これはまた、機能が分散エンティティであり、単一の領域に集中していないことを示唆する。もう一つの側面は、パターンを分解し、神経結合に役立つシグナル「ブレーク」である。 This paper suggests a statistical framework for describing the relations between the physical and conceptual entities of a brain-like model. Features and concept instances are put into context, where the paper suggests that features may be the electrical wiring, although chemical connections are also possible. With this idea, the actual length of the connection is important, because it is related to firing rates and neuron synchronization, but the signal type is less important. The paper then suggests that concepts are neuron groups that link feature sets and concept instances are determined by chemical signals from those groups. Therefore, features become the static horizontal framework of the neural system and concepts are vertically interconnected combinations of these. With regards to functionality, the neuron is then considered to be functional and the more horizontal memory structures can even be glial. This would also suggest that features can be distributed entities and not concentrated to a single area. Another aspect could be signal 'breaks' that compartmentalise a pattern and may help with neural binding.	公開日:2024-09-26 翻訳日:2024-11-09 16:01:17
# 確率的要求による車両経路問題のゲーミフィケーション Gamifying the Vehicle Routing Problem with Stochastic Requests ( http://arxiv.org/abs/1911.05922v2 ) ライセンス: Link先を確認	Nicholas D. Kullman, Nikita Dudorov, Jorge E. Mendoza, Martin Cousineau, Justin C. Goodson,	(参考訳) あなたの最初のビデオゲームコンソールを覚えていますか。私たちは自分のことを思い出す。数十年前、彼らは何時間もエンターテイメントを提供していた。現在、動的および確率的最適化問題を解くためにそれらを再利用している。幅広いアタリゲームに超人的パフォーマンスをポストする深層強化学習手法により,古典的な物流問題をゲームとして表現する作業を考える。その後、エージェントを訓練してプレイします。確率的要求を伴う車両経路問題のゲーム設計について検討する。パースペクティブ、視野、ミニマップなど、さまざまなデザイン特徴がエージェントのパフォーマンスにどのように影響するかを示す。適切なゲーム設計では、一般的な目的であるAtariエージェントは、特に問題のサイズが大きくなるにつれて、最適化ベースのベンチマークを上回ります。我々の研究は、ゲームによる動的および確率的最適化問題の表現を、有望な研究方向として示している。 Do you remember your first video game console? We remember ours. Decades ago, they provided hours of entertainment. Now, we have repurposed them to solve dynamic and stochastic optimization problems. With deep reinforcement learning methods posting superhuman performance on a wide range of Atari games, we consider the task of representing a classic logistics problem as a game. Then, we train agents to play it. We consider several game designs for the vehicle routing problem with stochastic requests. We show how various design features impact agents' performance, including perspective, field of view, and minimaps. With the right game design, general purpose Atari agents outperform optimization-based benchmarks, especially as problem size grows. Our work points to the representation of dynamic and stochastic optimization problems via games as a promising research direction.	公開日:2024-09-23 翻訳日:2024-11-09 16:01:17
# 脳発見のためのマルチレゾリューショングラフエッジ埋め込みの学習神経疾患におけるネットワーク機能障害 Learning Multi-resolution Graph Edge Embedding for Discovering Brain Network Dysfunction in Neurological Disorders ( http://arxiv.org/abs/1912.01181v1 ) ライセンス: Link先を確認	Xin Ma, Guorong Wu, Seong Jae Hwang, Won Hwa Kim	(参考訳) 最近の異種の文献では、異なる脳領域、すなわち脳の接続が神経疾患の早期症状をもたらすことが示されている。グラフニューラルネットワーク(GNN)技術に対する大きな取り組みにも関わらず、グラフノードに重点を置いているため、現在の最先端のGNNメソッドは、グラフリンク上の疾患関連ネットワーク障害パターンを特徴付けることを目的としたグラフとして、脳接続を分類するのに適さない。この問題に対処するために,診断カテゴリ間で高い判別能力を有する病原性結合性ベンチマークを検出するためのマルチレゾリューションエッジネットワーク(MENET)を提案する。 MENETの中核は、我々が提案する新しいグラフエッジワイド変換であり、マルチ解像度 ``connectomic'' 機能をキャプチャすることができる。連結特徴の豊富な集合を用いて、識別エッジを共同で選択し、グラフの診断ラベルを割り当てるグラフ学習フレームワークを考案する。 2つの実際のデータセットでの実験により、MENETは診断ラベルを正確に予測し、アルツハイマー病や注意・抑止・多動性障害などの神経疾患と密接に関連している脳の結合性を特定する。 Tremendous recent literature show that associations between different brain regions, i.e., brain connectivity, provide early symptoms of neurological disorders. Despite significant efforts made for graph neural network (GNN) techniques, their focus on graph nodes makes the state-of-the-art GNN methods not suitable for classifying brain connectivity as graphs where the objective is to characterize disease-relevant network dysfunction patterns on graph links. To address this issue, we propose Multi-resolution Edge Network (MENET) to detect disease-specific connectomic benchmarks with high discrimination power across diagnostic categories. The core of MENET is a novel graph edge-wise transform that we propose, which allows us to capture multi-resolution ``connectomic'' features. Using a rich set of the connectomic features, we devise a graph learning framework to jointly select discriminative edges and assign diagnostic labels for graphs. Experiments on two real datasets show that MENET accurately predicts diagnostic labels and identify brain connectivities highly associated with neurological disorders such as Alzheimer's Disease and Attention-Deficit/Hyperactivity Disorder.	公開日:2024-09-26 翻訳日:2024-11-09 16:01:17
# 神経障害における脳ネットワーク障害発見のための多分解能グラフエッジ埋め込みの学習 Learning Multi-resolution Graph Edge Embedding for Discovering Brain Network Dysfunction in Neurological Disorders ( http://arxiv.org/abs/1912.01181v2 ) ライセンス: Link先を確認	Xin Ma, Guorong Wu, Seong Jae Hwang, Won Hwa Kim,	(参考訳) 最近の異種の文献では、異なる脳領域、すなわち脳の接続が神経疾患の早期症状をもたらすことが示されている。グラフニューラルネットワーク(GNN)技術に対する大きな取り組みにも関わらず、グラフノードに重点を置いているため、現在の最先端のGNNメソッドは、グラフリンク上の疾患関連ネットワーク障害パターンを特徴付けることを目的としたグラフとして、脳接続を分類するのに適さない。この問題に対処するために,診断カテゴリ間で高い判別能力を有する病原性結合性ベンチマークを検出するためのマルチレゾリューションエッジネットワーク(MENET)を提案する。 MENETの中核は、我々が提案する新しいグラフエッジワイド変換であり、マルチ解像度 ``connectomic'' 機能をキャプチャすることができる。連結特徴の豊富な集合を用いて、識別エッジを共同で選択し、グラフの診断ラベルを割り当てるグラフ学習フレームワークを考案する。 2つの実際のデータセットでの実験により、MENETは診断ラベルを正確に予測し、アルツハイマー病や注意・抑止・多動性障害などの神経疾患と密接に関連している脳の結合性を特定する。 Tremendous recent literature show that associations between different brain regions, i.e., brain connectivity, provide early symptoms of neurological disorders. Despite significant efforts made for graph neural network (GNN) techniques, their focus on graph nodes makes the state-of-the-art GNN methods not suitable for classifying brain connectivity as graphs where the objective is to characterize disease-relevant network dysfunction patterns on graph links. To address this issue, we propose Multi-resolution Edge Network (MENET) to detect disease-specific connectomic benchmarks with high discrimination power across diagnostic categories. The core of MENET is a novel graph edge-wise transform that we propose, which allows us to capture multi-resolution ``connectomic'' features. Using a rich set of the connectomic features, we devise a graph learning framework to jointly select discriminative edges and assign diagnostic labels for graphs. Experiments on two real datasets show that MENET accurately predicts diagnostic labels and identify brain connectivities highly associated with neurological disorders such as Alzheimer's Disease and Attention-Deficit/Hyperactivity Disorder.	公開日:2024-09-26 翻訳日:2024-11-09 15:57:56
# 教師なしの学習表現:クエストは終わりか? Unsupervisedly Learned Representations: Should the Quest be Over? ( http://arxiv.org/abs/2001.07495v1 ) ライセンス: Link先を確認	Daniel N. Nissani (Nissensohn)	(参考訳) 研究から40年経っても、最良の教師なし学習表現法と知的動物が達成した精度率との間には、およそ20%の分類精度のギャップが残っている。したがって、間違った方向を向いているのかもしれない。このパズルの解法が提示される。強化学習が動物と同じ精度の表現を学習できることを実証する。私たちの主な貢献は、以下の観察にある。 a) 実環境に適用する場合は、強化学習はラベルを必要としないため、正当に教師なし学習とみなすことができる。対照的に、強化学習をシミュレーション環境で適用する場合は、本質的にラベルを必要とするため、一般的には監督学習とみなすべきである。これらの観察の要点は、シミュレーション環境で訓練される可能性のある教師なし学習の競争パラダイムのさらなる探索が無駄になる可能性があるということである。 After four decades of research there still exists a Classification accuracy gap of about 20% between our best Unsupervisedly Learned Representations methods and the accuracy rates achieved by intelligent animals. It thus may well be that we are looking in the wrong direction. A possible solution to this puzzle is presented. We demonstrate that Reinforcement Learning can learn representations which achieve the same accuracy as that of animals. Our main modest contribution lies in the observations that: a. when applied to a real world environment Reinforcement Learning does not require labels, and thus may be legitimately considered as Unsupervised Learning, and b. in contrast, when Reinforcement Learning is applied in a simulated environment it does inherently require labels and should thus be generally be considered as Supervised Learning. The corollary of these observations is that further search for Unsupervised Learning competitive paradigms which may be trained in simulated environments may be futile.	公開日:2024-09-26 翻訳日:2024-11-09 15:57:56
# 教師なしの学習表現:クエストは終わりか? Unsupervisedly Learned Representations: Should the Quest be Over? ( http://arxiv.org/abs/2001.07495v4 ) ライセンス: Link先を確認	Daniel N. Nissani,	(参考訳) 研究から40年経っても、最良の教師なし学習表現法と知的動物が達成した精度率との間には、およそ20%の分類精度のギャップが残っている。したがって、間違った方向を向いているのかもしれない。このパズルの解法が提示される。強化学習が動物と同じ精度の表現を学習できることを実証する。私たちの主な貢献は、以下の観察にある。 a) 実環境に適用する場合は、強化学習はラベルを必要としないため、正当に教師なし学習とみなすことができる。対照的に、強化学習をシミュレーション環境で適用する場合は、本質的にラベルを必要とするため、一般的には監督学習とみなすべきである。これらの観察の要点は、シミュレーション環境で訓練される可能性のある教師なし学習の競争パラダイムのさらなる探索が無駄になる可能性があるということである。 After four decades of research there still exists a Classification accuracy gap of about 20% between our best Unsupervisedly Learned Representations methods and the accuracy rates achieved by intelligent animals. It thus may well be that we are looking in the wrong direction. A possible solution to this puzzle is presented. We demonstrate that Reinforcement Learning can learn representations which achieve the same accuracy as that of animals. Our main modest contribution lies in the observations that: a. when applied to a real world environment Reinforcement Learning does not require labels, and thus may be legitimately considered as Unsupervised Learning, and b. in contrast, when Reinforcement Learning is applied in a simulated environment it does inherently require labels and should thus be generally be considered as Supervised Learning. The corollary of these observations is that further search for Unsupervised Learning competitive paradigms which may be trained in simulated environments may be futile.	公開日:2024-09-26 翻訳日:2024-11-09 15:57:56
# 教師なしの学習表現:クエストは終わりか? Unsupervisedly Learned Representations: Should the Quest be Over? ( http://arxiv.org/abs/2001.07495v5 ) ライセンス: Link先を確認	Daniel N. Nissani,	(参考訳) 研究から40年経っても、最良の教師なし学習表現法と知的動物が達成した精度率との間には、およそ20%の分類精度のギャップが残っている。したがって、間違った方向を向いているのかもしれない。このパズルの解法が提示される。強化学習が動物と同じ精度の表現を学習できることを実証する。私たちの主な貢献は、以下の観察にある。 a) 実環境に適用する場合は、強化学習はラベルを必要としないため、正当に教師なし学習とみなすことができる。対照的に、強化学習をシミュレーション環境で適用する場合は、本質的にラベルを必要とするため、一般的には監督学習とみなすべきである。これらの観察の要点は、シミュレーション環境で訓練される可能性のある教師なし学習の競争パラダイムのさらなる探索が無駄になる可能性があるということである。 After four decades of research there still exists a Classification accuracy gap of about 20% between our best Unsupervisedly Learned Representations methods and the accuracy rates achieved by intelligent animals. It thus may well be that we are looking in the wrong direction. A possible solution to this puzzle is presented. We demonstrate that Reinforcement Learning can learn representations which achieve the same accuracy as that of animals. Our main modest contribution lies in the observations that: a. when applied to a real world environment Reinforcement Learning does not require labels, and thus may be legitimately considered as Unsupervised Learning, and b. in contrast, when Reinforcement Learning is applied in a simulated environment it does inherently require labels and should thus be generally be considered as Supervised Learning. The corollary of these observations is that further search for Unsupervised Learning competitive paradigms which may be trained in simulated environments may be futile.	公開日:2024-09-26 翻訳日:2024-11-09 15:57:56
# 代数的クリプトアナリシスに関するフォーマルパワーシリーズ Formal Power Series on Algebraic Cryptanalysis ( http://arxiv.org/abs/2007.14729v3 ) ライセンス: Link先を確認	Shuhei Nakamura,	(参考訳) 多項式方程式の系を解くための暗号系を減少させる攻撃の複雑性推定において、第1の転落次数の正則度と上界は、しばしば暗号解析において用いられる。正則性の次数は半正則性仮定の下で単変量形式列を用いて容易に計算できるが、第1の転位次数の上界を決定するためには、入力システムの具体的なシジーを調べる必要がある。本稿では,多項式系における第1降下次数の上界を十分に大域にわたって検討する。この場合、非半正則系の第一降下次数は正則度で上界し、多階多項式系の第一落下次数は、多変量形式的級数列から決定される一定の値で上界することを示す。さらに、多項式系の最初の転倒次数を計算するための理論的な仮定を十分に大きな場上で提供する。 In the complexity estimation for an attack that reduces a cryptosystem to solving a system of polynomial equations, the degree of regularity and an upper bound of the first fall degree are often used in cryptanalysis. While the degree of regularity can be easily computed using a univariate formal power series under the semi-regularity assumption, determining an upper bound of the first fall degree requires investigating the concrete syzygies of an input system. In this paper, we investigate an upper bound of the first fall degree for a polynomial system over a sufficiently large field. In this case, we prove that the first fall degree of a non-semi-regular system is bounded above by the degree of regularity, and that the first fall degree of a multi-graded polynomial system is bounded above by a certain value determined from a multivariate formal power series. Moreover, we provide a theoretical assumption for computing the first fall degree of a polynomial system over a sufficiently large field.	公開日:2024-09-20 翻訳日:2024-11-09 15:57:56
# ゼロ知識ゲーム Zero Knowledge Games ( http://arxiv.org/abs/2009.13521v7 ) ライセンス: Link先を確認	Ian Malloy,	(参考訳) 本稿では,不完全なリコールと不完全な情報によって,全ての戦略が不完全であるようなゲームをモデル化する。また,リニアトランスフォーメーションとして修正されたスライディングブロックコードを導入し,プレイヤーの公開発表時の情報伝達に関する共通知識を生成する。最終的に、2つのプレイヤーまたは2つの連立関係の間に、両方のプレイヤーに知らせられるゼロ知識ゲームは、混合戦略ナッシュ均衡に確立された信頼の効力を持つ。ゼロ知識ゲームは信頼と健全性の1つである。非インフォームドの選手の場合、そのようなプレイヤーは非インフォームドであることを明らかにする。検証の意思」は、クレームが繰り返し虚偽のクレームの責任を負ったり、非インフォームされたりすることがないように浸食されることがある。 In this paper we model a game such that all strategies are non-revealing, with imperfect recall and incomplete information. We also introduce a modified sliding-block code as a linear transformation which generates common knowledge of how informed a player is under public announcements. Ultimately, we see that between two players or two coalitions; zero-knowledge games where both players are informed have the utility of trust established in the mixed strategy Nash equilibrium. A zero-knowledge game is one of trust and soundness, placing utility in being informed. For any player who may be uninformed, such players reveal they are uninformed. The "will to verify" may be eroded such that the claimant is never held responsible for their repeated false claims or being uninformed.	公開日:2024-09-22 翻訳日:2024-11-09 15:57:56
# 無線360度ビデオストリーミングのためのクロス層最適化と分散強化学習 Cross Layer Optimization and Distributed Reinforcement Learning for Wireless 360° Video Streaming ( http://arxiv.org/abs/2011.06356v3 ) ライセンス: Link先を確認	Anis Elgabli, Mohammed S. Elbamby, Cristina Perfecto, Mounssif Krouka, Mehdi Bennis, Vaneet Aggarwal,	(参考訳) ワイヤレスで高画質の360度ビデオをストリーミングすることは、今でも難しい問題だ。異なる360度ビデオを見たり、コンピューティングや通信リソースに競合するユーザがたくさんいる場合、ストリーミングアルゴリズムは、各ユーザに対して最小限のレートを保証しながら、平均品質(QoE)を最大化すべきである。本稿では,各ユーザに対して利用可能なレートを最大化し,ユーザのQoEを最大化するために効率的に利用するクロスレイヤ最適化手法を提案する。特にタイルベースの360度ビデオストリーミングを検討し、各ユーザのQoEの最大化とユーザ間の公正性の確保とのトレードオフをバランスさせるQoEメトリックを最適化する。この問題を2つの相互関連サブプロブレムに分解できることを示す。一利用者毎のダウンロード率を見つけることを目的とする物理層サブプロブレム二利用者のQoEが最大になるように、そのレートを用いてタイルごとの品質判定を行うことを目的とするアプリケーション層サブプロブレム。物理層サブプロブレムを低複雑性で最適に解き、複数の独立エージェントの並列トレーニングを活用してアプリケーション層サブプロブレムを解くためにアクタ・クリティカル・ディープ・強化学習(DRL)を提案する。大規模な実験により,提案手法の頑健さが明らかになり,いくつかのベースラインアルゴリズムと比較して顕著な性能向上が示された。 Wirelessly streaming high quality 360 degree videos is still a challenging problem. When there are many users watching different 360 degree videos and competing for the computing and communication resources, the streaming algorithm at hand should maximize the average quality of experience (QoE) while guaranteeing a minimum rate for each user. In this paper, we propose a cross layer optimization approach that maximizes the available rate to each user and efficiently uses it to maximize users' QoE. Particularly, we consider a tile based 360 degree video streaming, and we optimize a QoE metric that balances the tradeoff between maximizing each user's QoE and ensuring fairness among users. We show that the problem can be decoupled into two interrelated subproblems: (i) a physical layer subproblem whose objective is to find the download rate for each user, and (ii) an application layer subproblem whose objective is to use that rate to find a quality decision per tile such that the user's QoE is maximized. We prove that the physical layer subproblem can be solved optimally with low complexity and an actor-critic deep reinforcement learning (DRL) is proposed to leverage the parallel training of multiple independent agents and solve the application layer subproblem. Extensive experiments reveal the robustness of our scheme and demonstrate its significant performance improvement compared to several baseline algorithms.	公開日:2024-09-24 翻訳日:2024-11-09 15:57:56
# チェッカーボード反強磁性体のほぼ退化状態とそのボソニック解釈 Nearly degenerate ground states of a checkerboard antiferromagnet and their bosonic interpretation ( http://arxiv.org/abs/2011.06520v2 ) ライセンス: Link先を確認	Haiyuan Zou, Fan Yang, Wei Ku,	(参考訳) J_1$-$J_2$チェッカーボード格子上の反強磁性(AF)カップリングを持つスピン-$1/2$モデル系は、平面ピロクロアモデルとして知られ、強いフラストレーションを伴い、2次元から1次元のクロスオーバーと結びついている。 Projected Entangled Simplex States tensor network ansatz を用いて、フラストレーション領域 (J_1<J_2$) におけるほぼ退化状態の多数を同定する。具体的には、長寿命クロスダイマー価結合固体(VBS)が、J_1\lesssim J_2$の基底状態であるのに対して、1D AF相関状態が残りを乗っ取る。ネマティック摂動に対するVBS状態の安定性を検証する。対応するボゾン像は低エネルギー物理学の直感的な理解を与える。特に,VBS状態がより弱いことを予測し,数値的に確認する。本研究は, この興味深いシステムの最も重要な基底状態特性を明らかにし, フラストレーション磁化処理におけるボゾン像の有用性を実証するものである。 The spin-$1/2$ model system with antiferromagnetic (AF) couplings on a $J_1$-$J_2$ checkerboard lattice, known as the planar pyrochlore model, is strongly frustrated and associated with a two-to-one dimensional crossover. Using the Projected Entangled Simplex States tensor network ansatz, we identify a large number of nearly degenerate states in the frustrated region ($J_1<J_2$). Specifically, we find the long-sought crossed-dimer valence bond solid (VBS) state to be the ground state at $J_1\lesssim J_2$, while various 1D AF correlated states take over the rest. We verify the stability of the VBS state against nematic perturbation. The corresponding bosonic picture provides an intuitive understanding of the low-energy physics. Particularly, it predicts weaker VBS states in the easy-plane limit, which we confirm numerically. Our results clarify the most essential ground state properties of this interesting system and demonstrate the usefulness of bosonic picture in dealing with frustrated magnetism.	公開日:2024-09-24 翻訳日:2024-11-09 15:57:56
# 高次元データに関する講義ノート Lecture notes on high-dimensional data ( http://arxiv.org/abs/2101.05841v7 ) ライセンス: Link先を確認	Sven-Ake Wegner,	(参考訳) 以下は、2019-2020年にイギリスでBScの学生に教えた「数学データサイエンス」の講座の最初の部分に基づく講義ノートである。トピックは、高次元における測度集中、高次元におけるガウス確率ベクトル、乱射影、ガウスデータの分離・分離である。改訂版が教科書 (Mathematical Introduction to Data Science, Springer, Berlin, Heidelberg, 2024, https://link.springer.com/book/10.1007/978-3-662-69426-8] の一部として出版された。 These are lecture notes based on the first part of a course on 'Mathematical Data Science', which I taught to final year BSc students in the UK in 2019-2020. Topics include: concentration of measure in high dimensions; Gaussian random vectors in high dimensions; random projections; separation/disentangling of Gaussian data. A revised version has been published as part of the textbook [Mathematical Introduction to Data Science, Springer, Berlin, Heidelberg, 2024, https://link.springer.com/book/10.1007/978-3-662-69426-8].	公開日:2024-09-20 翻訳日:2024-11-09 15:57:56
# 加速法 Acceleration Methods ( http://arxiv.org/abs/2101.09545v1 ) ライセンス: Link先を確認	Alexandre d'Aspremont, Damien Scieur and Adrien Taylor	(参考訳) このモノグラフは、凸最適化に頻繁に使用される加速技術における最近の進歩をカバーしている。まず、2次最適化問題を用いて、モーメントとネスト最適化スキームという2つの主要な手法群を導入する。これらは二次の場合と一致してチェビシェフ法を形成する。モーメント法について、ネステロフのセミナルな研究から始まり、最適化された勾配法のようないくつかのマスターテンプレートを用いて構造収束証明を議論し、モーメント法が収束保証をいかに最適化するかを示す重要な利点を提供する。さらに、同様のアルゴリズムパターンを用いて、CatalystおよびAccelerated Hybrid Proximal Extragradientフレームワークの心臓部において、近位加速度をさらにカバーする。一般的な加速技術は、目の前の問題における正則性パラメータの知識に直接依存する。我々は、観測されない正則性パラメータに適応しつつ、ほぼ最適な収束率に達するための一連の簡単な手法である再起動スキームを議論することで結論付ける。 This monograph covers some recent advances in a range of acceleration techniques frequently used in convex optimization. We first use quadratic optimization problems to introduce two key families of methods, namely momentum and nested optimization schemes. They coincide in the quadratic case to form the Chebyshev method. We discuss momentum methods in detail, starting with the seminal work of Nesterov and structure convergence proofs using a few master templates, such as that for optimized gradient methods, which provide the key benefit of showing how momentum methods optimize convergence guarantees. We further cover proximal acceleration, at the heart of the Catalyst and Accelerated Hybrid Proximal Extragradient frameworks, using similar algorithmic patterns. Common acceleration techniques rely directly on the knowledge of some of the regularity parameters in the problem at hand. We conclude by discussing restart schemes, a set of simple techniques for reaching nearly optimal convergence rates while adapting to unobserved regularity parameters.	公開日:2024-09-24 翻訳日:2024-11-09 15:57:56
# 加速法 Acceleration Methods ( http://arxiv.org/abs/2101.09545v4 ) ライセンス: Link先を確認	Alexandre d'Aspremont, Damien Scieur, Adrien Taylor,	(参考訳) このモノグラフは、凸最適化に頻繁に使用される加速技術における最近の進歩をカバーしている。まず、2次最適化問題を用いて、モーメントとネスト最適化スキームという2つの主要な手法群を導入する。これらは二次の場合と一致してチェビシェフ法を形成する。モーメント法について、ネステロフのセミナルな研究から始まり、最適化された勾配法のようないくつかのマスターテンプレートを用いて構造収束証明を議論し、モーメント法が収束保証をいかに最適化するかを示す重要な利点を提供する。さらに、同様のアルゴリズムパターンを用いて、CatalystおよびAccelerated Hybrid Proximal Extragradientフレームワークの心臓部において、近位加速度をさらにカバーする。一般的な加速技術は、目の前の問題における正則性パラメータの知識に直接依存する。我々は、観測されない正則性パラメータに適応しつつ、ほぼ最適な収束率に達するための一連の簡単な手法である再起動スキームを議論することで結論付ける。 This monograph covers some recent advances in a range of acceleration techniques frequently used in convex optimization. We first use quadratic optimization problems to introduce two key families of methods, namely momentum and nested optimization schemes. They coincide in the quadratic case to form the Chebyshev method. We discuss momentum methods in detail, starting with the seminal work of Nesterov and structure convergence proofs using a few master templates, such as that for optimized gradient methods, which provide the key benefit of showing how momentum methods optimize convergence guarantees. We further cover proximal acceleration, at the heart of the Catalyst and Accelerated Hybrid Proximal Extragradient frameworks, using similar algorithmic patterns. Common acceleration techniques rely directly on the knowledge of some of the regularity parameters in the problem at hand. We conclude by discussing restart schemes, a set of simple techniques for reaching nearly optimal convergence rates while adapting to unobserved regularity parameters.	公開日:2024-09-24 翻訳日:2024-11-09 15:57:56
# フラクタル上のスピン-1/2ハイゼンベルク反強磁性体におけるギャップレススピン液体と非局所コーナー励起 Gapless Spin Liquid and Non-local Corner Excitation in the Spin-1/2 Heisenberg Antiferromagnet on Fractal ( http://arxiv.org/abs/2105.12487v2 ) ライセンス: Link先を確認	Haiyuan Zou, Wei Wang,	(参考訳) フラクタル系の数学的美しさと最近の実験的実現により、スピン-$1/2$反強磁性ハイゼンベルク模型をSierpińskiガスケット上で研究した。フラクタル多孔質の特徴は、エキゾチックな量子状態を示す新しい種類のフラストレーションを生み出す。先進テンソルネットワーク技術を用いて,分数空間次元における量子ギャップレス-スピン-液体基底状態の同定を行う。このフラクタルスピン系は非自明な非局所的性質も示している。超短距離相関は、非常に縮退したスピン形成因子を引き起こすが、このフラクタル系の絡み合いは長距離スケーリングの挙動を示唆している。また, 動的構造因子について検討し, 基底状態の絡み目から生じる安定なコーナー励起によるギャップレス励起を明らかにした。我々の結果は、このフラクタルスピンシステムの複数の重要な性質を不明瞭に指摘し、スピン液体とフラストレーション磁石を探索する新たな経路を開く。 Motivated by the mathematical beauty and the recent experimental realizations of fractal systems, we study the spin-$1/2$ antiferromagnetic Heisenberg model on a Sierpiński gasket. The fractal porous feature generates new kinds of frustration to exhibit exotic quantum states. Using advanced tensor network techniques, we identify a quantum gapless-spin-liquid ground state in fractional spatial dimension. This fractal spin system also demonstrates nontrivial non-local properties. While the extremely short-range correlation causes a highly degenerate spin form factor, the entanglement in this fractal system suggests a long-range scaling behavior. We also study the dynamic structure factor and clearly identify the gapless excitation with a stable corner excitation emerged from the ground-state entanglement. Our results unambiguously point out multiple essential properties of this fractal spin system, and open a new route to explore spin liquid and frustrated magnetism.	公開日:2024-09-24 翻訳日:2024-11-09 15:57:56
# 直交性制約問題に対する高速ランダム化法 Faster Randomized Methods for Orthogonality Constrained Problems ( http://arxiv.org/abs/2106.12060v2 ) ライセンス: Link先を確認	Boris Shustin, Haim Avron,	(参考訳) 近年の文献では、データサイエンスや計算科学を通じて生じる様々な行列問題の解法を高速化するためのランダム化手法の使用が提唱されている。ランダム化を利用する一般的な戦略の1つは、問題のサイズを減らす方法として使うことである。しかし、この戦略に基づく手法は、いくつかのアプリケーションに十分な精度を欠いている。ランダム化プレコンディショニング(Randomized preconditioning)は、より高精度なランダム化手法である。乱数化プレコンディショニングの最大の課題は、根底にある反復的手法の必要性であり、そのため、これまでは回帰問題や線形システムにのみランダム化プレコンディショニングが適用されてきた。本稿では、乱数化前提条件の適用を、データサイエンスで広く普及している別の重要な問題、すなわち(一般化された)直交制約による最適化問題にどのように拡張するかを示す。我々は、リーマン最適化とリーマン事前条件の枠組みに基づく、支配的な正準相関の計算問題とフィッシャー線形判別分析問題に基づくアプローチを実証する。両問題に対して,プレコンディショニングが計算コストと漸近収束に及ぼす影響を評価し,本手法の有効性を実証的に示す。 Recent literature has advocated the use of randomized methods for accelerating the solution of various matrix problems arising throughout data science and computational science. One popular strategy for leveraging randomization is to use it as a way to reduce problem size. However, methods based on this strategy lack sufficient accuracy for some applications. Randomized preconditioning is another approach for leveraging randomization, which provides higher accuracy. The main challenge in using randomized preconditioning is the need for an underlying iterative method, thus randomized preconditioning so far have been applied almost exclusively to solving regression problems and linear systems. In this article, we show how to expand the application of randomized preconditioning to another important set of problems prevalent across data science: optimization problems with (generalized) orthogonality constraints. We demonstrate our approach, which is based on the framework of Riemannian optimization and Riemannian preconditioning, on the problem of computing the dominant canonical correlations and on the Fisher linear discriminant analysis problem. For both problems, we evaluate the effect of preconditioning on the computational costs and asymptotic convergence, and demonstrate empirically the utility of our approach.	公開日:2024-09-26 翻訳日:2024-11-09 15:57:56
# 深部線形ニューラルネットワークのロスランドスケープ--第2次分析 The loss landscape of deep linear neural networks: a second-order analysis ( http://arxiv.org/abs/2107.13289v1 ) ライセンス: Link先を確認	El Mehdi Achour, François Malgouyres (IMT), Sébastien Gerchinovitz (IMT)	(参考訳) 正方形損失を伴う深部線形ニューラルネットワークの最適化環境について検討する。弱い仮定の下では、急激な局所ミニマは存在せず、局所的な極小マも存在しないことが知られている。しかし、一階アルゴリズムの力学において重要な役割を果たしうる非制限サドル点の存在と多様性は、わずかに研究されているだけである。最適化の展望を順2で完全に分析し、さらに一歩進める。我々は、すべての臨界点の中で、大域最小化点、厳格なサドル点、非制限サドル点を特徴づける。関連するすべての臨界値を列挙する。特徴付けは単純で、部分行列積のランクの条件を伴い、線形ニューラルネットワークを最適化する際に証明または観察された大域収束や暗黙の正則化にいくらか光を当てる。通過において、全大域最小化器の集合の明示的なパラメータ化を提供し、厳密で非制限的なサドル点の集合を示す。 We study the optimization landscape of deep linear neural networks with the square loss. It is known that, under weak assumptions, there are no spurious local minima and no local maxima. However, the existence and diversity of non-strict saddle points, which can play a role in first-order algorithms' dynamics, have only been lightly studied. We go a step further with a full analysis of the optimization landscape at order 2. We characterize, among all critical points, which are global minimizers, strict saddle points, and non-strict saddle points. We enumerate all the associated critical values. The characterization is simple, involves conditions on the ranks of partial matrix products, and sheds some light on global convergence or implicit regularization that have been proved or observed when optimizing linear neural networks. In passing, we provide an explicit parameterization of the set of all global minimizers and exhibit large sets of strict and non-strict saddle points.	公開日:2024-09-25 翻訳日:2024-11-09 15:57:56
# 深部線形ニューラルネットワークのロスランドスケープ:2次解析 The loss landscape of deep linear neural networks: a second-order analysis ( http://arxiv.org/abs/2107.13289v3 ) ライセンス: Link先を確認	El Mehdi Achour, François Malgouyres, Sébastien Gerchinovitz,	(参考訳) 正方形損失を伴う深部線形ニューラルネットワークの最適化環境について検討する。弱い仮定の下では、急激な局所ミニマは存在せず、局所的な極小マも存在しないことが知られている。しかし、一階アルゴリズムの力学において重要な役割を果たしうる非制限サドル点の存在と多様性は、わずかに研究されているだけである。最適化の展望を順2で完全に分析し、さらに一歩進める。我々は、すべての臨界点の中で、大域最小化点、厳格なサドル点、非制限サドル点を特徴づける。関連するすべての臨界値を列挙する。特徴付けは単純で、部分行列積のランクの条件を伴い、線形ニューラルネットワークを最適化する際に証明または観察された大域収束や暗黙の正則化にいくらか光を当てる。通過において、全大域最小化器の集合の明示的なパラメータ化を提供し、厳密で非制限的なサドル点の集合を示す。 We study the optimization landscape of deep linear neural networks with the square loss. It is known that, under weak assumptions, there are no spurious local minima and no local maxima. However, the existence and diversity of non-strict saddle points, which can play a role in first-order algorithms' dynamics, have only been lightly studied. We go a step further with a full analysis of the optimization landscape at order 2. We characterize, among all critical points, which are global minimizers, strict saddle points, and non-strict saddle points. We enumerate all the associated critical values. The characterization is simple, involves conditions on the ranks of partial matrix products, and sheds some light on global convergence or implicit regularization that have been proved or observed when optimizing linear neural networks. In passing, we provide an explicit parameterization of the set of all global minimizers and exhibit large sets of strict and non-strict saddle points.	公開日:2024-09-25 翻訳日:2024-11-09 15:57:56
# LAViTeR:画像とキャプション生成による視覚・テキスト表現の学習 LAViTeR: Learning Aligned Visual and Textual Representations Assisted by Image and Caption Generation ( http://arxiv.org/abs/2109.04993v3 ) ライセンス: Link先を確認	Mohammad Abuzar Hashemi, Zhanghexuan Li, Mihir Chauhan, Yan Shen, Abhishek Satbhai, Mir Basheer Ali, Mingchen Gao, Sargur Srihari,	(参考訳) 大規模な画像テキストペアからの視覚的およびテキスト的表現の事前学習は、多くの下流視覚言語タスクの標準的アプローチになりつつある。トランスフォーマーベースのモデルは、自己教師付き学習タスクのリストを通じて、モーダル内およびモーダル内注意を学習する。本稿では,視覚およびテキスト表現学習のための新しいアーキテクチャであるLAViTeRを提案する。メインモジュールであるVisual Textual Alignment (VTA)は、GANベースの画像合成とイメージキャプションという2つの補助的なタスクによって支援される。また,学習した視覚とテキストの埋め込みの類似度を計測する新しい評価指標を提案する。 CUBとMS-COCOの2つの公開データセットによる実験結果から、関節機能埋め込み空間における視覚的およびテキスト的表現のアライメントが優れていることが示された。 Pre-training visual and textual representations from large-scale image-text pairs is becoming a standard approach for many downstream vision-language tasks. The transformer-based models learn inter and intra-modal attention through a list of self-supervised learning tasks. This paper proposes LAViTeR, a novel architecture for visual and textual representation learning. The main module, Visual Textual Alignment (VTA) will be assisted by two auxiliary tasks, GAN-based image synthesis and Image Captioning. We also propose a new evaluation metric measuring the similarity between the learnt visual and textual embedding. The experimental results on two public datasets, CUB and MS-COCO, demonstrate superior visual and textual representation alignment in the joint feature embedding space	公開日:2024-10-01 翻訳日:2024-11-09 15:57:56
# 画像属性編集のための高忠実度GANインバージョン High-Fidelity GAN Inversion for Image Attribute Editing ( http://arxiv.org/abs/2109.06590v4 ) ライセンス: Link先を確認	Tengfei Wang, Yong Zhang, Yanbo Fan, Jue Wang, Qifeng Chen,	(参考訳) 本稿では、画像固有の細部(背景、外観、照明など)をよく保存した属性編集を可能にする、新しい高忠実度のGAN(generative adversarial network)インバージョンフレームワークを提案する。まず、非可逆データ圧縮の観点から、高忠実度GANインバージョンの課題を解析する。低ビットレートの潜在符号では、従来の研究は再構成画像や編集画像の高忠実度の詳細を保存することが困難である。潜在符号のサイズを増やすことで、GANインバージョンの精度は向上するが、編集性は劣る。編集性を損なうことなく画像の忠実度を向上させるために、歪みマップを高忠実度再構成の参照として用いる歪みコンサルテーション手法を提案する。歪みコンサルテーションインバージョン(DCI)では、歪みマップはまず高レートの潜在マップに投影され、次にコンサルテーション融合によって基本的な低レート潜在符号をより多くの詳細で補完する。高忠実度編集を実現するために、編集画像と反転画像のギャップを埋める、自己教師付きトレーニングスキームを用いた適応歪みアライメント(ADA)モジュールを提案する。顔領域と車領域における大規模な実験は、インバージョンと編集品質の両方において明らかな改善を示している。 We present a novel high-fidelity generative adversarial network (GAN) inversion framework that enables attribute editing with image-specific details well-preserved (e.g., background, appearance, and illumination). We first analyze the challenges of high-fidelity GAN inversion from the perspective of lossy data compression. With a low bit-rate latent code, previous works have difficulties in preserving high-fidelity details in reconstructed and edited images. Increasing the size of a latent code can improve the accuracy of GAN inversion but at the cost of inferior editability. To improve image fidelity without compromising editability, we propose a distortion consultation approach that employs a distortion map as a reference for high-fidelity reconstruction. In the distortion consultation inversion (DCI), the distortion map is first projected to a high-rate latent map, which then complements the basic low-rate latent code with more details via consultation fusion. To achieve high-fidelity editing, we propose an adaptive distortion alignment (ADA) module with a self-supervised training scheme, which bridges the gap between the edited and inversion images. 
Extensive experiments in the face and car domains show a clear improvement in both inversion and editing quality.	公開日:2024-09-27 翻訳日:2024-11-09 15:57:56
# ゴールデンデリシャスリンゴにおける酵素的褐変欠陥検出のための新しい簡易ビジョンアルゴリズム A New Simple Vision Algorithm for Detecting the Enzymic Browning Defects in Golden Delicious Apples ( http://arxiv.org/abs/2110.03574v2 ) ライセンス: Link先を確認	Hamid Majidi Balanji,	(参考訳) 本研究では、酵素的褐変過程によって生じるゴールデンデリシャスリンゴの表面欠陥を抽出し、同定するために、簡単なビジョンアルゴリズムを設計、実装した。実験では34個のゴールデンデリシャスリンゴが選択され、そのうち17個は酵素的褐変欠陥があり、残りの17個は健全であった。提案したビジョンアルゴリズムの画像処理部は、リンゴの欠陥表面積を97.15%の精度で抽出した。分割画像の面積と平均は、2x1特徴ベクトルとして選択され、設計された人工ニューラルネットワークに入力される。以上の特徴に基づく解析から、平均が0.0065未満の画像は欠陥リンゴに属さず、むしろ健全なリンゴのがくと茎の一部として抽出されたものであることが明らかとなった。本研究で適用されたニューラルネットワークの分類精度は99.19%であった。 In this work, a simple vision algorithm is designed and implemented to extract and identify the surface defects on Golden Delicious apples caused by the enzymic browning process. 34 Golden Delicious apples were selected for the experiments, of which 17 had enzymic browning defects and the other 17 were sound. The image processing part of the proposed vision algorithm extracted the defective surface area of the apples with a high accuracy of 97.15%. The area and mean of the segmented images were selected as the 2x1 feature vectors to feed into a designed artificial neural network. The analysis based on the above features indicated that the images with a mean less than 0.0065 did not belong to the defective apples; rather, they were extracted as part of the calyx and stem of the healthy apples. The classification accuracy of the neural network applied in this study was 99.19%.	公開日:2024-09-22 翻訳日:2024-11-09 15:57:56
# 適応的同時分布学習 Adaptive joint distribution learning ( http://arxiv.org/abs/2110.04829v5 ) ライセンス: Link先を確認	Damir Filipovic, Michael Multerer, Paul Schneider,	(参考訳) テンソル積再生核ヒルベルト空間 (RKHS) を用いた同時確率分布推定のための新しいフレームワークを開発した。我々のフレームワークは、ラドン-ニコディム導関数の低次元で正規化された正のモデルに対応しており、これを最大数百万のサンプルサイズから推定することで、RKHSモデリングに固有の制約を緩和する。明確に定義された、正規化された正の条件付き分布は、我々のアプローチの自然な副産物である。提案手法は高速に計算でき、予測から分類までの学習問題に対応できる。理論的な結果は好ましい数値結果によって補われている。 We develop a new framework for estimating joint probability distributions using tensor product reproducing kernel Hilbert spaces (RKHS). Our framework accommodates a low-dimensional, normalized and positive model of a Radon--Nikodym derivative, which we estimate from sample sizes of up to several million, alleviating the inherent limitations of RKHS modeling. Well-defined normalized and positive conditional distributions are natural by-products of our approach. Our proposal is fast to compute and accommodates learning problems ranging from prediction to classification. Our theoretical findings are supplemented by favorable numerical results.	公開日:2024-09-24 翻訳日:2024-11-09 15:57:56
# 科学者はどのようにしてオブザーバーに依存しない科学を確立することができるのか? How can scientists establish an observer-independent science? Embodied cognition, consciousness and quantum mechanics ( http://arxiv.org/abs/2112.15428v3 ) ライセンス: Link先を確認	John Realpe-Gómez,	(参考訳) 行動と知覚が互いに共決定し、行動知覚ループを形成すると仮定する、身体化された認知の理論を支持する証拠が増えている。これは、我々人間が知覚するものに何らかの形で参加していることを示唆している。では、科学者はどのようにして行動知覚ループから逃れ、観察者に依存しない世界の記述を得ることができるのか? ここでは、心の哲学と、科学および量子物理学のリバースエンジニアリングに基づく一連の予想を提示し、この問題を探求する。我々は、伝統的に理解されている身体性が、虚時間量子力学の側面を示しうると論じる。次に、真の実時間量子力学の側面を得るのに必要な追加の制約について検討する。特に、実験を行う身体化された科学者は別の科学者の視点から記述されなければならず(これは身体化された認知への従来のアプローチでは無視されている)、観察者は、他の観察者が経験する対象と、他の対象を経験する「主体」の両方として相補的な役割を担うと推測する。 Evidence is growing for the theory of embodied cognition, which posits that action and perception co-determine each other, forming an action-perception loop. This suggests that we humans somehow participate in what we perceive. So, how can scientists escape the action-perception loop to obtain an observer-independent description of the world? Here we present a set of conjectures informed by the philosophy of mind and a reverse-engineering of science and quantum physics to explore this question. We argue that embodiment, as traditionally understood, can manifest aspects of imaginary-time quantum dynamics. We then explore what additional constraints are required to obtain aspects of genuine, real-time quantum dynamics. In particular, we conjecture that an embodied scientist doing experiments must be described from the perspective of another scientist, which is ignored in traditional approaches to embodied cognition, and that observers play complementary roles as both objects experienced by other observers and "subjects" that experience other objects.	公開日:2024-09-27 翻訳日:2024-11-09 15:57:56
# より高速な勾配変種を用いたプライバシー保護ロジスティック回帰トレーニング Privacy-Preserving Logistic Regression Training with A Faster Gradient Variant ( http://arxiv.org/abs/2201.10838v9 ) ライセンス: Link先を確認	John Chiang,	(参考訳) 暗号化されたデータに対するロジスティック回帰のトレーニングは、長年にわたり、セキュリティ上の懸念に対処する有力なアプローチであった。本稿では、プライバシー保護ロジスティック回帰トレーニングのための効率的な勾配変種である$quadratic$ $gradient$(二次勾配)を紹介する。我々は、Nesterov の Accelerated Gradient (NAG)、Adaptive Gradient Algorithm (Adagrad)、およびAdamアルゴリズムを、それぞれの二次勾配を組み込むことで拡張し、これらの改良アルゴリズムを様々なデータセット上で評価する。実験により、従来の1階勾配法と比較して、改良アルゴリズムは収束速度を著しく向上することを示した。さらに、準同型ロジスティック回帰トレーニングの実装に改良NAG法を適用し、わずか4回の反復で同等の結果を得ることができた。二次勾配のアプローチは、1階の勾配降下/上昇アルゴリズムと2階のニュートン・ラフソン法を統合できる可能性が高く、幅広い数値最適化問題に適用できる可能性も高い。 Training logistic regression over encrypted data has been a compelling approach in addressing security concerns for several years. In this paper, we introduce an efficient gradient variant, called $quadratic$ $gradient$, for privacy-preserving logistic regression training. We enhance Nesterov's Accelerated Gradient (NAG), Adaptive Gradient Algorithm (Adagrad), and Adam algorithms by incorporating their quadratic gradients and evaluate these improved algorithms on various datasets. Experimental results demonstrate that the enhanced algorithms achieve significantly improved convergence speed compared to traditional first-order gradient methods. Moreover, we applied the enhanced NAG method to implement homomorphic logistic regression training, achieving comparable results within just 4 iterations. There is a good chance that the quadratic gradient approach could integrate first-order gradient descent/ascent algorithms with the second-order Newton-Raphson methods, and that it could be applied to a wide range of numerical optimization problems.	公開日:2024-09-22 翻訳日:2024-11-09 15:46:48
# ZXダイアグラムの微分積分と量子機械学習への応用 Differentiating and Integrating ZX Diagrams with Applications to Quantum Machine Learning ( http://arxiv.org/abs/2201.13250v7 ) ライセンス: Link先を確認	Quanlong Wang, Richie Yeung, Mark Koch,	(参考訳) ZX計算は、幅広い応用に成功しており、量子技術にとって有用なツールであることが証明されている。これらの応用のほとんどは代数的性質のものである。しかし、微分と積分を含む他のタスクは、現在のZX技術では到達できないままである。ここでは、微分と積分を完全にZX計算の枠組み内で実現することにより、ZXを解析的視点に高める。不毛な台地(barren plateaus)の解析のために量子機械学習の文脈に適用することで、ZX計算の新しい解析的フレームワークを具体的に示す。 ZX-calculus has proved to be a useful tool for quantum technology with a wide range of successful applications. Most of these applications are of an algebraic nature. However, other tasks that involve differentiation and integration remain unreachable with current ZX techniques. Here we elevate ZX to an analytical perspective by realising differentiation and integration entirely within the framework of ZX-calculus. We explicitly illustrate the new analytic framework of ZX-calculus by applying it in the context of quantum machine learning for the analysis of barren plateaus.	公開日:2024-09-25 翻訳日:2024-11-09 15:46:48
# 低ビットレート映像理解のための符号化フレームワークとベンチマーク A Coding Framework and Benchmark towards Low-Bitrate Video Understanding ( http://arxiv.org/abs/2202.02813v3 ) ライセンス: Link先を確認	Yuan Tian, Guo Lu, Yichao Yan, Guangtao Zhai, Li Chen, Zhiyong Gao,	(参考訳) ビデオ圧縮は、ほとんどのビデオ分析システムにとって不可欠である。転送帯域を節約しているにもかかわらず、特に低ビットレート設定では、下流のビデオ理解タスクも悪化する。この問題を体系的に検討するために,我々はまず,従来の手法,すなわちタスク分離,ラベルなし,データエマージされたセマンティクスという3つの原則が,マシンフレンドリーなコーディングフレームワークにとって重要であるが,今のところ完全に満足していないことを明らかにした。本稿では,従来のコーデックとニューラルネットワーク(NN)の両方を活用することによって,これらすべての原則を同時に満たす従来型ニューラル混合コーディングフレームワークを提案する。一方、従来のコーデックはビデオのピクセル信号を効率的に符号化できるが、意味情報を歪ませることもある。一方、高非線形NNは、ビデオセマンティクスをコンパクトな表現に凝縮するのに熟練している。このフレームワークは、自己管理された方法でラベルのないデータから自発的に学習されるコーディング手順に、動画の移動効率のよい意味表現が保存されることを保証することで最適化される。 2つのストリーム(コーデックとNN)から共同でデコードされたビデオは、リッチなセマンティクスを持ち、視覚的に写真リアリスティックであり、いくつかの主流のダウンストリームビデオ分析タスクのパフォーマンスを、後処理なしで実証的に向上させる。さらに,アテンション機構とアダプティブ・モデリング・スキームを導入することで,本手法の映像セマンティック・モデリング能力をさらに強化する。最後に、8つのデータセット上の3つの下流タスクを備えた低ビットレートビデオ理解ベンチマークを構築し、我々のアプローチの顕著な優位性を実証した。すべてのコード、データ、モデルは、 \url{https://github.com/tianyuan168326/VCS-Pytorch}で利用可能である。 Video compression is indispensable to most video analysis systems. Despite saving transportation bandwidth, it also deteriorates downstream video understanding tasks, especially at low-bitrate settings. To systematically investigate this problem, we first thoroughly review the previous methods, revealing that three principles, i.e., task-decoupled, label-free, and data-emerged semantic prior, are critical to a machine-friendly coding framework but are not fully satisfied so far. In this paper, we propose a traditional-neural mixed coding framework that simultaneously fulfills all these principles, by taking advantage of both traditional codecs and neural networks (NNs). On one hand, the traditional codecs can efficiently encode the pixel signal of videos but may distort the semantic information. On the other hand, highly non-linear NNs are proficient in condensing video semantics into a compact representation. 
The framework is optimized by ensuring that a transportation-efficient semantic representation of the video is preserved w.r.t. the coding procedure, which is spontaneously learned from unlabeled data in a self-supervised manner. The videos collaboratively decoded from two streams (codec and NN) are semantically rich and visually photo-realistic, empirically boosting the performance of several mainstream downstream video analysis tasks without any post-adaptation procedure. Furthermore, by introducing the attention mechanism and adaptive modeling scheme, the video semantic modeling ability of our approach is further enhanced. Finally, we build a low-bitrate video understanding benchmark with three downstream tasks on eight datasets, demonstrating the notable superiority of our approach. All code, data, and models will be available at https://github.com/tianyuan168326/VCS-Pytorch.	公開日:2024-09-22 翻訳日:2024-11-09 15:46:48
# 逐次実験に対する実測的推論 Counterfactual inference for sequential experiments ( http://arxiv.org/abs/2202.06891v4 ) ライセンス: Link先を確認	Raaz Dwivedi, Katherine Tian, Sabina Tomkins, Predrag Klasnja, Susan Murphy, Devavrat Shah,	(参考訳) 複数の単位が時間とともに適応する処理ポリシーを用いて、複数の時間点に対する処理を割り当てるシーケンシャルな設計実験のアフタースタディ統計的推論を考察する。我々の目標は、最小限の可能な規模(各単位と各単位の異なる処理の下での平均結果)で、適応的な処理ポリシーに関する最小限の仮定で、カウンターファクト平均に対する推論保証を提供することです。反事実的手段に関する構造的な仮定がなければ、この課題は観測されたデータポイントよりも多くの未知のために実現不可能である。そこで本研究では,非線形混合効果モデルの非パラメトリック一般化と,先行研究で考慮された双線形潜在因子モデルの非パラメトリック一般化として機能する潜在因子モデルを提案する。推定には、近辺の変種である非パラメトリック法を用い、各単位と各時間に対する対実平均に対して非漸近的高確率誤差を定めている。正規性条件の下では、この境界は、単位数と時間点が適切な速度で一緒に$\infty$に増加するにつれて、反ファクトリアル平均に対する漸近的に妥当な信頼区間をもたらす。我々は,いくつかのシミュレーションと,モバイル医療臨床試験HeartStepsのデータを含むケーススタディを通して,我々の理論を解説する。 We consider after-study statistical inference for sequentially designed experiments wherein multiple units are assigned treatments for multiple time points using treatment policies that adapt over time. Our goal is to provide inference guarantees for the counterfactual mean at the smallest possible scale -- mean outcome under different treatments for each unit and each time -- with minimal assumptions on the adaptive treatment policy. Without any structural assumptions on the counterfactual means, this challenging task is infeasible due to more unknowns than observed data points. To make progress, we introduce a latent factor model over the counterfactual means that serves as a non-parametric generalization of the non-linear mixed effects model and the bilinear latent factor model considered in prior works. For estimation, we use a non-parametric method, namely a variant of nearest neighbors, and establish a non-asymptotic high probability error bound for the counterfactual mean for each unit and each time. Under regularity conditions, this bound leads to asymptotically valid confidence intervals for the counterfactual mean as the number of units and time points grows to $\infty$ together at suitable rates. 
We illustrate our theory via several simulations and a case study involving data from a mobile health clinical trial HeartSteps.	公開日:2024-09-22 翻訳日:2024-11-09 15:46:48
# フレキシブル匿名ネットワークを目指して Towards Flexible Anonymous Networks ( http://arxiv.org/abs/2203.03764v4 ) ライセンス: Link先を確認	Florentin Rochet, Jules Dejaeghere, Tariq Elahi,	(参考訳) Torのような匿名通信設計は、世界各地でリレーを運用する多くのボランティアに分散された信頼に基づいてセキュリティを構築している。実際には、この分散はTorソフトウェアの多くのバージョンが共存する異種ネットワークにつながり、それぞれ異なるプロトコル機能を持つ。この異種性のため、Tor開発者はネットワークの拡張性を維持する戦略として、前方互換のプロトコル設計を採用する。この戦略は、Torソフトウェアの異なるバージョンが、回復不能なエラーなしに相互作用することを保証することを目的としている。本研究では、前方互換性のあるプロトコル上の配慮によって可能になるプロトコル寛容性を、根本的なセキュリティ問題として位置づける。開発者にとって有益である一方で、プロトコル寛容性は過去15年間にTorに対する強力な攻撃を数多く引き起こしてきた、と我々は論じる。この問題に対処するために、Flexible Anonymous Network (FAN)を提案する。これはボランティアベースの分散ネットワークのための新しいソフトウェアアーキテクチャで、開発者がソフトウェアを継続的に進化させる能力を失うことなく、プロトコル寛容性への依存から脱却する。我々は、i) 実装のインスタンスを作成し、ii) そのオーバーヘッドを評価し、iii) 今もなおTorに当てはまる深刻な攻撃に対する防御におけるFANの利点のいくつかを実験する。 Anonymous Communication designs such as Tor build their security on distributed trust over many volunteers running relays in diverse global locations. In practice, this distribution leads to a heterogeneous network in which many versions of the Tor software co-exist, each with differing sets of protocol features. Because of this heterogeneity, Tor developers employ forward-compatible protocol design as a strategy to maintain network extensibility. This strategy aims to guarantee that different versions of the Tor software interact without unrecoverable errors. In this work, we cast protocol tolerance that is enabled by forward-compatible protocol considerations as a fundamental security issue. We argue that, while being beneficial for the developers, protocol tolerance has resulted in a number of strong attacks against Tor in the past fifteen years. To address this issue, we propose Flexible Anonymous Network (FAN), a new software architecture for volunteer-based distributed networks that shifts the dependence away from protocol tolerance without losing the ability for developers to ensure the continuous evolution of their software. 
We i) instantiate an implementation, ii) evaluate its overheads, and iii) experiment with several of FAN's benefits to defend against a severe attack still applicable to Tor today.	公開日:2024-09-23 翻訳日:2024-11-09 15:46:48
# 再帰的変分量子コンパイル Recursive Variational Quantum Compiling ( http://arxiv.org/abs/2203.08514v2 ) ライセンス: Link先を確認	Stian Bilek, Kristian Wold,	(参考訳) 変分量子コンパイル(VQC)アルゴリズムは、深い量子回路を浅いパラメータ化アンザッツで近似することを目的としており、NISQハードウェアにより適したものにする。本稿では、再帰的変分量子コンパイル(RVQC)アルゴリズムと呼ばれるVQCの変種を提案する。既存のVQCアルゴリズムでは通常、コンパイル中に全回路をコヒーレントに実行する必要がある。ノイズの影響下では、十分に深いターゲット回路は通常のVQCではコンパイルが不可能となる。コンパイルはしばしば勾配に基づく量子古典的アプローチによって達成されるので、量子ノイズは最適化時にノイズの多い勾配として現れ、収束が困難になる。一方、RVQCは、まず回路を$N$個の短いサブ回路に分割し、一度に1つのサブ回路を評価することで、回路をコンパイルすることができる。その結果、RVQCを実装するために必要な回路深さは、ターゲット回路の深さではなく、サブ回路の深さに依存する。十分に大きい$N$を選択することで、個々にコンパイルを成功させるのに十分浅いサブ回路が確保できる。 RVQCはIBM SantiagoデバイスのノイズモデルでVQCと比較され、ランダムに生成された深さ約1000の5量子ビット回路をコンパイルすることを目的としていた。 VQCは500回の最適化反復で収束できなかった。一方、RVQCは、ターゲット回路を$N = 5$個に分割する際に、合計500回のイテレーションで$0.90 \pm 0.05$の忠実度に収束することができた。 Variational quantum compiling (VQC) algorithms aim to approximate deep quantum circuits with shallow parameterized ansatzes, making them more suitable for NISQ hardware. In this article, a variant of VQC named the recursive variational quantum compiling (RVQC) algorithm is proposed. Existing VQC algorithms typically require coherently executing the full circuit during compilation. Under the influence of noise, sufficiently deep target circuits make compiling unfeasible using ordinary VQC. Since the compiling is often accomplished using a gradient-based quantum-classical approach, the quantum noise manifests as a noisy gradient during optimization, making convergence hard to obtain. On the other hand, RVQC can compile a circuit by first dividing it into $N$ shorter sub-circuits, then evaluating one sub-circuit at a time. As a result, the circuit depth required to implement RVQC is not dependent on the depth of the target circuit, but on the depth of the sub-circuits. Choosing a high enough $N$ thus ensures sufficiently shallow sub-circuits that can be successfully compiled individually. 
RVQC was compared with VQC on a noise model of the IBM Santiago device with the goal of compiling several randomly generated five-qubit circuits of approximately depth 1000. It was shown that VQC was not able to converge within 500 iterations of optimization. On the other hand, RVQC was able to converge to a fidelity of $0.90 \pm 0.05$ within a total of 500 iterations when splitting the target circuits into $N = 5$ parts.	公開日:2024-09-24 翻訳日:2024-11-09 15:46:48
# 汎用エージェント研究のためのサンドボックス環境 The Sandbox Environment for Generalizable Agent Research (SEGAR) ( http://arxiv.org/abs/2203.10351v2 ) ライセンス: Link先を確認	R Devon Hjelm, Bogdan Mazoure, Florian Golemo, Samira Ebrahimi Kahou, Pedro Braga, Felipe Frujeri, Mihai Jalobeanu, Andrey Kolobov,	(参考訳) 対話型環境における逐次意思決定タスクの一般化に関する研究の幅広い課題は、進歩を明確に示すベンチマークを設計することである。顕著な進展はあったが、現在のベンチマークは、根底にある要因の適切な露出や直感的な制御を提供していないか、実装・カスタマイズ・拡張が容易でないか、あるいは実行に計算コストがかかる。これらすべてを念頭に置いて、汎用エージェント研究のためのサンドボックス環境(SEGAR)を構築した。 SEGARは、タスク分布を指定することで一般化目的を容易に設計でき、それによって研究者が一般化目的の性質を測定できるため、RLにおける一般化研究の容易さと説明責任を向上させる。本稿では、SEGARの概要と、SEGARがこれらの目標にどのように貢献するか、および、SEGARが答える助けとなるいくつかの種類の研究課題を実証する実験を紹介する。 A broad challenge of research on generalization for sequential decision-making tasks in interactive environments is designing benchmarks that clearly mark progress. While there has been notable headway, current benchmarks either do not provide suitable exposure or intuitive control of the underlying factors, are not easy to implement, customize, or extend, or are computationally expensive to run. We built the Sandbox Environment for Generalizable Agent Research (SEGAR) with all of these things in mind. SEGAR improves the ease and accountability of generalization research in RL, as generalization objectives can be easily designed by specifying task distributions, which in turn allows the researcher to measure the nature of the generalization objective. We present an overview of SEGAR and how it contributes to these goals, as well as experiments that demonstrate a few types of research questions SEGAR can help answer.	公開日:2024-09-26 翻訳日:2024-11-09 15:46:48
# テレポーテーションによる量子ルーティング Quantum Routing with Teleportation ( http://arxiv.org/abs/2204.04185v2 ) ライセンス: Link先を確認	Dhruv Devulapalli, Eddie Schoute, Aniruddha Bapat, Andrew M. Childs, Alexey V. Gorshkov,	(参考訳) 任意に高速な局所演算と古典通信(LOCC)が可能な量子系において、相互作用制約の下で量子ビットの任意の置換を実装する問題について検討する。特に、エンタングルメントを分配し、LOCCを用いて量子テレポーテーションを行うことで、スワップベースおよびより一般的なユニタリルーティング手法に対する高速化が得られる例を示す。さらに、テレポーテーションが最悪ケースのルーティング時間においてスワップベースのルーティングに対し対数的な高速化を与える相互作用グラフの例を述べる。また、量子テレポーテーションによって得られる高速化の限界について検討し、任意の相互作用グラフに対してルーティング時間の分離に$O(\sqrt{N \log N})$の上界を示すとともに、いくつかの一般的なグラフのクラスに対してより厳密な境界を与える。 We study the problem of implementing arbitrary permutations of qubits under interaction constraints in quantum systems that allow for arbitrarily fast local operations and classical communication (LOCC). In particular, we show examples of speedups over swap-based and more general unitary routing methods by distributing entanglement and using LOCC to perform quantum teleportation. We further describe an example of an interaction graph for which teleportation gives a logarithmic speedup in the worst-case routing time over swap-based routing. We also study limits on the speedup afforded by quantum teleportation - showing an $O(\sqrt{N \log N})$ upper bound on the separation in routing time for any interaction graph - and give tighter bounds for some common classes of graphs.	公開日:2024-09-23 翻訳日:2024-11-09 15:46:48
# 生物学的時系列データから確率力学方程式を発見する Discovering stochastic dynamical equations from biological time series data ( http://arxiv.org/abs/2205.02645v6 ) ライセンス: Link先を確認	Arshed Nabeel, Ashwin Karichannavar, Shuaib Palathingal, Jitesh Jhawar, David B. Brückner, Danny Raj M., Vishwesha Guttal,	(参考訳) 理論的研究により、確率性は反直観的な方法で生態系の力学に影響を与えることが示されている。しかし、個体群や生態系の動態を規定する方程式を知らずに、実際のデータセットにおける確率性の役割を確かめることは困難である。したがって、データセットから支配確率方程式を推定する逆問題は重要である。本稿では,状態変数の時系列データを入力とし,確率微分方程式を出力する方程式探索手法を提案する。確率計算からの従来のアプローチと方程式発見手法を組み合わせることでこれを実現できる。いくつかの応用を通して,本手法の一般化を実証する。まず、基本的に異なる支配方程式を持つ様々な確率モデルを意図的に選択するが、ほぼ同一の定常分布を生成する。時系列データのみの解析から,正しい基礎となる方程式を復元し,その安定性を正確に推定できることが示される。我々は,魚の学習と単一細胞移動という,時空間スケールとダイナミクスの異なる2つの実世界のデータセット上で,我々の手法を実証する。本手法の様々な限界と潜在的な落とし穴と診断方法による克服方法について述べる。最後に、PyDaDDy(Python Library for Data Driven Dynamics)というパッケージを通じて、オープンソースコードを提供しています。 Theoretical studies have shown that stochasticity can affect the dynamics of ecosystems in counter-intuitive ways. However, without knowing the equations governing the dynamics of populations or ecosystems, it is difficult to ascertain the role of stochasticity in real datasets. Therefore, the inverse problem of inferring the governing stochastic equations from datasets is important. Here, we present an equation discovery methodology that takes time series data of state variables as input and outputs a stochastic differential equation. We achieve this by combining traditional approaches from stochastic calculus with the equation-discovery techniques. We demonstrate the generality of the method via several applications. First, we deliberately choose various stochastic models with fundamentally different governing equations; yet they produce nearly identical steady-state distributions. We show that we can recover the correct underlying equations, and thus infer the structure of their stability, accurately from the analysis of time series data alone. 
We demonstrate our method on two real-world datasets -- fish schooling and single-cell migration -- which have vastly different spatiotemporal scales and dynamics. We illustrate various limitations and potential pitfalls of the method and how to overcome them via diagnostic measures. Finally, we provide our open-source codes via a package named PyDaDDy (Python library for Data Driven Dynamics).	公開日:2024-09-22 翻訳日:2024-11-09 15:46:48
# DQNは学ぶか? Does DQN Learn? ( http://arxiv.org/abs/2205.13617v4 ) ライセンス: Link先を確認	Aditya Gopalan, Gugan Thoppe,	(参考訳) 強化学習法が有用であるためには、その限界で見積もるポリシーは、少なくとも平均的には、初期推定よりも優れている必要がある。本研究では,全ての可能な状態や動作を無限に見ることができても,広く使用されている深層Q-Network (DQN) が,この基本的な基準を満たさないことを示す(この条件により,表型Q-ラーニングの最適Q-値への収束が保証される)。私たちの作品のハイライトは以下のとおりです。第一に、DQNは一般的に、初期よりも悪い政策を生み出す非自明な確率を持つことを示す。第二に、線形DQNの文脈でこの振る舞いを理論的に説明し、ニューラルネットワークを線形関数近似に置き換えるが、DQNの他の重要な概念、例えば経験的リプレイ、ターゲットネットワーク、および$\epsilon$-greedy探索を保持する。我々の主な結果は、線形DQNの尾の挙動は、決定論的微分包含の不変集合、つまり微分方程式の集合値一般化によって支配されることである。特に、これらの不変集合は局所的最適ポリシーと整合する必要はないことを示し、DQNの準最適ポリシーへの収束や政策振動といった病理学的挙動を説明する。また、制限ポリシーが常に最悪であるシナリオも提供します。我々の研究は、関数近似と$\epsilon$-greedyの探索によるQ-ラーニングの振る舞いの理解における長年のギャップに対処する。 For a reinforcement learning method to be useful, the policy it estimates in the limit must be superior to the initial guess, at least on average. In this work, we show that the widely used Deep Q-Network (DQN) fails to meet even this basic criterion, even when it gets to see all possible states and actions infinitely often (a condition that ensures tabular Q-learning's convergence to the optimal Q-value). Our work's key highlights are as follows. First, we numerically show that DQN generally has a non-trivial probability of producing a policy worse than the initial one. Second, we give a theoretical explanation for this behavior in the context of linear DQN, wherein we replace the neural network with a linear function approximation but retain DQN's other key ideas, such as experience replay, target network, and $\epsilon$-greedy exploration. Our main result is that the tail behaviors of linear DQN are governed by invariant sets of a deterministic differential inclusion, a set-valued generalization of a differential equation. Notably, we show that these invariant sets need not align with locally optimal policies, thus explaining DQN's pathological behaviors, such as convergence to sub-optimal policies and policy oscillation. 
We also provide a scenario where the limiting policy is always the worst. Our work addresses a longstanding gap in understanding the behaviors of Q-learning with function approximation and $\epsilon$-greedy exploration.	公開日:2024-09-21 翻訳日:2024-11-09 15:46:48
# GraphMLP: 3Dヒューマンポーズ推定のためのグラフMLPライクなアーキテクチャ GraphMLP: A Graph MLP-Like Architecture for 3D Human Pose Estimation ( http://arxiv.org/abs/2206.06420v5 ) ライセンス: Link先を確認	Wenhao Li, Mengyuan Liu, Hong Liu, Tianyu Guo, Ti Wang, Hao Tang, Nicu Sebe,	(参考訳) 現代の多層パーセプトロン(MLP)モデルは、自己注意なしで視覚表現を学習する際に競争力のある結果を示している。しかし、既存のMLPモデルは、局所的な詳細を捉えるのが得意ではなく、人体構成に関する事前知識が欠けているため、骨格表現学習のモデリング能力は制限されている。これらの課題に対処するため、我々は、3次元ポーズ推定のためのグローバル・ローカル・グラフィカルな統一アーキテクチャにおいてMLPとグラフ畳み込みネットワーク(GCN)を組み合わせた、シンプルかつ効果的なグラフ強化型MLPライクアーキテクチャGraphMLPを提案する。 GraphMLPは、人体のグラフ構造をMLPモデルに組み込んで、3D人間ポーズのドメイン固有の要求を満たすとともに、局所的およびグローバルな空間的相互作用を可能にする。さらに、GraphMLPをビデオ領域に柔軟かつ効率的に拡張することを提案し、複雑な時間的ダイナミクスを、系列長に対する計算コストの増加が無視できる簡単な方法で効果的にモデル化できることを示す。我々の知る限りでは、これは単一フレームとビデオシーケンスの両方で3次元ポーズ推定を行う最初のMLPライクなアーキテクチャである。大規模な実験により、提案したGraphMLPは、Human3.6MとMPI-INF-3DHPの2つのデータセットで最先端のパフォーマンスを達成することが示された。コードとモデルはhttps://github.com/Vegetebird/GraphMLPで公開されている。 Modern multi-layer perceptron (MLP) models have shown competitive results in learning visual representations without self-attention. However, existing MLP models are not good at capturing local details and lack prior knowledge of human body configurations, which limits their modeling power for skeletal representation learning. To address these issues, we propose a simple yet effective graph-reinforced MLP-Like architecture, named GraphMLP, that combines MLPs and graph convolutional networks (GCNs) in a global-local-graphical unified architecture for 3D human pose estimation. GraphMLP incorporates the graph structure of human bodies into an MLP model to meet the domain-specific demand of the 3D human pose, while allowing for both local and global spatial interactions. Furthermore, we propose to flexibly and efficiently extend the GraphMLP to the video domain and show that complex temporal dynamics can be effectively modeled in a simple way with negligible computational cost gains in the sequence length. 
To the best of our knowledge, this is the first MLP-Like architecture for 3D human pose estimation in a single frame and a video sequence. Extensive experiments show that the proposed GraphMLP achieves state-of-the-art performance on two datasets, i.e., Human3.6M and MPI-INF-3DHP. Code and models are available at https://github.com/Vegetebird/GraphMLP.	公開日:2024-09-21 翻訳日:2024-11-09 15:46:48
# 人間の目に触発されたリカレントニューラルネットワークは、敵対的ノイズに対してよりロバストである Human Eyes Inspired Recurrent Neural Networks are More Robust Against Adversarial Noises ( http://arxiv.org/abs/2206.07282v2 ) ライセンス: Link先を確認	Minkyu Choi, Yizhen Zhang, Kuan Han, Xiaokai Wang, Zhongming Liu,	(参考訳) 人間は、顕著な物体に焦点をあて、些細な詳細を無視することで、視覚環境を能動的に観察する。しかし、畳み込みニューラルネットワーク(CNN)に基づくコンピュータビジョンモデルは、単一のフィードフォワードパスを通じて、視覚入力を一度に分析することが多い。本研究では、人間の脳にインスパイアされたデュアルストリーム視覚モデルを設計した。このモデルは網膜のような入力層を特徴とし、次の注視点(固視点)を決定するストリームと、固視点の周囲の視覚情報を解釈するストリームの2つを含む。画像認識で訓練されたこのモデルは、一連の固視を通して画像を検査し、その都度異なる部分に焦点を当てることで、画像の表現を段階的に構築する。このモデルを、物体認識、視線行動、敵対的ロバスト性の観点から様々なベンチマークで評価した。以上の結果から、本モデルは人間の注意を模倣する訓練を明示的に受けずに人間と類似した形で注意を向け注視でき、網膜サンプリングと再帰的処理によって敵対的攻撃に対するロバスト性を高められることが示唆された。特に、このモデルはより多く見ることで知覚上の誤りを修正でき、この点でフィードフォワードのみのモデルとは一線を画す。結論として、網膜サンプリング、眼球運動、リカレントダイナミクスの相互作用は、人間のような視覚的探索と推論において重要である。 Humans actively observe the visual surroundings by focusing on salient objects and ignoring trivial details. However, computer vision models based on convolutional neural networks (CNN) often analyze visual input all at once through a single feed-forward pass. In this study, we designed a dual-stream vision model inspired by the human brain. This model features retina-like input layers and includes two streams: one determining the next point of focus (the fixation), while the other interprets the visuals surrounding the fixation. Trained on image recognition, this model examines an image through a sequence of fixations, each time focusing on different parts, thereby progressively building a representation of the image. We evaluated this model against various benchmarks in terms of object recognition, gaze behavior and adversarial robustness. Our findings suggest that the model can attend and gaze in ways similar to humans without being explicitly trained to mimic human attention, and that the model can enhance robustness against adversarial attacks due to its retinal sampling and recurrent processing. 
In particular, the model can correct its perceptual errors by taking more glances, setting itself apart from all feed-forward-only models. In conclusion, the interactions of retinal sampling, eye movement, and recurrent dynamics are important to human-like visual exploration and inference.	公開日:2024-09-26 翻訳日:2024-11-09 15:46:48
# トークンによる支払いシステム Token-Based Payment Systems ( http://arxiv.org/abs/2207.07530v2 ) ライセンス: Link先を確認	Geoffrey Goodell,	(参考訳) 本稿では,デジタル決済システムにおけるトークンと分散台帳の役割について考察する。本稿では,トークンを用いたデジタル決済システムの簡単な分類法を提案し,分散台帳技術がデジタル決済システム全般をサポートする方法の異なるモデルに対処する。我々は、消費者プライバシ、トークン発行、システムオペレーターに対する説明責任の観点から理解したデジタル決済システムの健全な機能に関するガイダンスを提供する。 In this article, we consider the roles of tokens and distributed ledgers in digital payment systems. We present a brief taxonomy of digital payment systems that use tokens, and we address the different models for how distributed ledger technology can support digital payment systems in general. We offer guidance on the salient features of digital payment systems, which we comprehend in terms of consumer privacy, token issuance, and accountability for system operators.	公開日:2024-09-21 翻訳日:2024-11-09 15:46:48
# マルチロボットコーディネーションのための分散微分可能な動的ゲーム Distributed Differentiable Dynamic Game for Multi-robot Coordination ( http://arxiv.org/abs/2207.08892v4 ) ライセンス: Link先を確認	Yizhi Zhou, Wanxin Jin, Xuan Wang,	(参考訳) 本稿では,マルチロボット協調における前方および逆問題の効率よく解決できる分散微分可能動的ゲーム(D3G)フレームワークを開発する。我々は,ロボットの動作が,他者の行動にも依存する自身のダイナミクスと目的によって決定される動的ゲームとして,マルチロボット協調を定式化する。前方問題では、D3Gは分散シューティングベースのナッシュソルバを開発することにより、全てのロボットが協調してゲームのナッシュ平衡を分散的に求めることを可能にする。ロボットが与えられた協調デモを模倣する目的(およびダイナミクス)パラメータを見つけ(学習)する逆問題において、D3Gは微分ポントリャーギンの最大原理に基づく微分解法を提案し、各ロボットがパラメータを分散的かつ協調的に更新できるようにする。タスク構成が異なる2種類のロボットを用いてD3Gをシミュレーションでテストする。その結果, 従来の手法と比較して, 前方および逆問題の解法におけるD3Gの有効性が示された。 This paper develops a Distributed Differentiable Dynamic Game (D3G) framework, which can efficiently solve the forward and inverse problems in multi-robot coordination. We formulate multi-robot coordination as a dynamic game, where the behavior of a robot is dictated by its own dynamics and objective that also depends on others' behavior. In the forward problem, D3G enables all robots collaboratively to seek the Nash equilibrium of the game in a distributed manner, by developing a distributed shooting-based Nash solver. In the inverse problem, where each robot aims to find (learn) its objective (and dynamics) parameters to mimic given coordination demonstrations, D3G proposes a differentiation solver based on Differential Pontryagin's Maximum Principle, which allows each robot to update its parameters in a distributed and coordinated manner. We test the D3G in simulation with two types of robots given different task configurations. The results demonstrate the effectiveness of D3G for solving both forward and inverse problems in comparison with existing methods.	公開日:2024-09-23 翻訳日:2024-11-09 15:46:48
# ベリー-ディポールの遷移における外在的および内在的非線形ホール効果 Extrinsic and Intrinsic Nonlinear Hall Effects across Berry-Dipole Transitions ( http://arxiv.org/abs/2208.02972v2 ) ライセンス: Link先を確認	Zheng-Yang Zhuang, Zhongbo Yan,	(参考訳) 3次元ホップ絶縁体(3-dimensional Hopf insulator)は、トポロジカル位相のクラスである。異なるホップ不変量を持つ2つの回転不変ホップ絶縁体相を分離する臨界点は、通常のディラック型やワイル型臨界点とは大きく異なり、量子化されたベリー双極子によって特徴付けられる。このようなベリー-双極子遷移に近く、弱ドーピング状態における外在的および内在的非線形ホール伝導率テンソルは、ドーピングレベルとバルクエネルギーギャップの比の2つの普遍関数によって特徴づけられ、遷移のホップ不変量の変化に直接比例する。我々の研究は、非線形ホール効果はベリー-双極子遷移全体にわたって一般的な量子化挙動を示し、非線形ホール効果とホップ不変量との対応性を確立することを示唆している。 Three-dimensional Hopf insulators are a class of topological phases beyond the tenfold-way classification. The critical point separating two rotation-invariant Hopf insulator phases with distinct Hopf invariants is quite different from the usual Dirac-type or Weyl-type critical points and uniquely characterized by a quantized Berry dipole. Close to such Berry-dipole transitions, we find that the extrinsic and intrinsic nonlinear Hall conductivity tensors in the weakly doped regime are characterized by two universal functions of the ratio between doping level and bulk energy gap, and are directly proportional to the change in Hopf invariant across the transition. Our work suggests that the nonlinear Hall effects display a general-sense quantized behavior across Berry-dipole transitions, establishing a correspondence between nonlinear Hall effects and Hopf invariant.	公開日:2024-09-27 翻訳日:2024-11-09 15:46:48
# ディープラーニングのためのラデマッハ複雑度に基づく一般化境界について On Rademacher Complexity-based Generalization Bounds for Deep Learning ( http://arxiv.org/abs/2208.04284v3 ) ライセンス: Link先を確認	Lan V. Truong,	(参考訳) Rademacherの複雑性に基づくアプローチは、少数の画像のクラスを分類するために、畳み込みニューラルネットワーク(CNN)上の非空の一般化バウンダリを生成することができる。一般リプシッツ活性化関数に対する関数空間とCNNの間の高次元写像のための新しいタラグランド縮約補題の開発は重要な技術的貢献である。以上の結果から,ReLU,Leaky ReLU,Parametric Rectifier Linear Unit,Sigmoid,Tanhなどの特別なアクティベーション機能を持つCNNのネットワーク長に依存しないことがわかった。 We show that the Rademacher complexity-based approach can generate non-vacuous generalisation bounds on Convolutional Neural Networks (CNNs) for classifying a small number of classes of images. The development of new Talagrand's contraction lemmas for high-dimensional mappings between function spaces and CNNs for general Lipschitz activation functions is a key technical contribution. Our results show that the Rademacher complexity does not depend on the network length for CNNs with some special types of activation functions such as ReLU, Leaky ReLU, Parametric Rectifier Linear Unit, Sigmoid, and Tanh.	公開日:2024-09-27 翻訳日:2024-11-09 15:46:48
# 量子マルチパラメータ推定のためのギャップパーシステンス定理 The gap persistence theorem for quantum multiparameter estimation ( http://arxiv.org/abs/2208.07386v3 ) ライセンス: Link先を確認	Lorcán O. Conlon, Jun Suzuki, Ping Koy Lam, Syed M. Assad,	(参考訳) 量子計測学の重要な側面の1つである測定の非両立性は、複数のパラメータの同時推定によってのみ明らかになる。対称対数微分 Cramér-Rao 境界 (SLDCRB) は、各パラメータを個別に推定するための最適測定が可換である場合、達成可能な精度を与える。最適測定が可換でない場合、SLDCRBは必ずしも到達できない。この点において、ホレボ・クラメール・ラオ境界(HCRB)は基本的役割を担い、量子状態の無限に多くのコピーを同時に測定できるとき、最終的な到達可能な精度を提供する。実用的な目的のために、長岡クラメール・ラオ境界(NCRB)はより関係があり、量子状態を個別に測定することに制限される場合に適用される。これら3つの境界の間の相互作用は、プローブ状態の有限コピーの集合的測定によって、究極の計測精度にいかに早く近づけるかを定めている。まず2パラメータ推定を考慮し、HCRBがプローブ状態の1つのコピーで飽和できない場合、プローブ状態の任意の有限個のコピーに対しても飽和できないことを証明した。これにより, HCRB を物理的に動機づけられたいくつかの問題に対して飽和させることは不可能であることを示す。任意の数のパラメータを推定する場合について,分離可能な測定によるSLDCRBの到達可能性に必要かつ十分な条件を提供する。さらに、SLDCRBがプローブ状態の1つのコピーで到達できない場合、プローブ状態の任意の有限個のコピーの集合的な測定でも到達できないことを示す。これらの結果は、プローブ状態の任意の有限個のコピーに対して、SLDCRBが到達可能であるために必要かつ十分な条件を提供する。これは、最近[P. Horodecki et al, Phys. Rev. X Quantum 3, 010101 (2022)] によって強調された5つの問題の1つを大幅に一般化した問題を解決する。 One key aspect of quantum metrology, measurement incompatibility, is evident only through the simultaneous estimation of multiple parameters. The symmetric logarithmic derivative Cramér-Rao bound (SLDCRB) gives the attainable precision if the optimal measurements for estimating each individual parameter commute. When the optimal measurements do not commute, the SLDCRB is not necessarily attainable. In this regard, the Holevo Cramér-Rao bound (HCRB) plays a fundamental role, providing the ultimate attainable precisions when one allows simultaneous measurements on infinitely many copies of a quantum state. For practical purposes, the Nagaoka Cramér-Rao bound (NCRB) is more relevant, applying when restricted to measuring quantum states individually. The interplay between these three bounds dictates how rapidly the ultimate metrological precisions can be approached through collective measurements on finite copies of the probe state. 
We first consider two-parameter estimation and prove that if the HCRB cannot be saturated with a single copy of the probe state, then it cannot be saturated for any finite number of copies of the probe state. With this, we show that it is impossible to saturate the HCRB for several physically motivated problems. For estimating any number of parameters, we provide necessary and sufficient conditions for the attainability of the SLDCRB with separable measurements. We further prove that if the SLDCRB cannot be reached with a single copy of the probe state, it cannot be reached with collective measurements on any finite number of copies of the probe state. These results together provide necessary and sufficient conditions for the attainability of the SLDCRB for any finite number of copies of the probe state. This solves a significant generalisation of one of the five problems recently highlighted by [P. Horodecki et al., Phys. Rev. X Quantum 3, 010101 (2022)].	公開日:2024-09-25 翻訳日:2024-11-09 15:46:48
# 数個の熱量子の分割による絡み合い成長 Entanglement growth via splitting of a few thermal quanta ( http://arxiv.org/abs/2208.07816v2 ) ライセンス: Link先を確認	Pradip Laha, Darren W. Moore, Radim Filip,	(参考訳) 量子分割は、アインシュタイン=ポドルスキー=ローゼン状態によって実証されたガウスの絡み合いの本質的な生成物であり、明らかに最も一般的に生じる絡み合いの形式である。一般に、これは高コヒーレントで低ノイズの外部駆動を持つ非線形過程の強い励起から生じる。対照的に、閉じ込められたイオンと超伝導回路における効率的な三線型過程を含む最近の実験は、数個の熱量子の分裂をテストするための相補的な可能性を開いた。このような小さな熱エネルギーによって刺激され、強い縮退したトリリニアカップリングは、蒸留可能な4次スクイージングの3dB以上で検出できる大量の非古典性を生成する。定常絡み合いは、トリリニアカップリングと平行に存在する第3モードへの頻繁なパッシブ線形カップリングによって生成される。この新しいエンタングルメントは、ガウスの近似の外にあるが、平均的な熱量子数によって驚くほど増大し、ガウスのエンタングルメントに欠落する。蒸留性スクイーズを用いて、非線形ボソニック系の新しい絡み合い機構に光を当てた。 Quanta splitting is an essential generator of Gaussian entanglement, exemplified by Einstein-Podolsky-Rosen states and apparently the most commonly occurring form of entanglement. In general, it results from the strong pumping of a nonlinear process with a highly coherent and low-noise external drive. In contrast, recent experiments involving efficient trilinear processes in trapped ions and superconducting circuits have opened the complementary possibility to test the splitting of a few thermal quanta. Stimulated by such small thermal energy, a strong degenerate trilinear coupling generates large amounts of nonclassicality, detectable by more than 3 dB of distillable quadrature squeezing. Substantial entanglement can be generated via frequent passive linear coupling to a third mode present in parallel with the trilinear coupling. This new form of entanglement, outside any Gaussian approximation, surprisingly grows with the mean number of split thermal quanta; a quality absent from Gaussian entanglement. Using distillable squeezing we shed light on this new entanglement mechanism for nonlinear bosonic systems.	公開日:2024-09-24 翻訳日:2024-11-09 15:46:48
# IDP-PGFE:物理誘導特徴抽出に基づく解釈可能な破壊予測器 IDP-PGFE: An Interpretable Disruption Predictor based on Physics-Guided Feature Extraction ( http://arxiv.org/abs/2208.13197v2 ) ライセンス: Link先を確認	Chengshuo Shen, Wei Zheng, Yonghua Ding, Xinkun Ai, Fengming Xue, Yu Zhong, Nengchao Wang, Li Gao, Zhipeng Chen, Zhoujun Yang, Zhongyong Chen, Yuan Pan, J-TEXT team,	(参考訳) ディスラプション予測は、特に機械学習(ML)ベースの手法において、近年急速に進歩している。予測器が特定の予測を行う理由を理解することは、将来のトカマク破壊予測器の予測精度と同じくらい重要である。ほとんどの破壊予測器の目的は、精度またはクロスマシン能力である。しかし、ディスラプション予測モデルが解釈可能であれば、特定のサンプルがディスラプション前駆体として分類される理由を知ることができる。これにより、入ってくる破壊のタイプを判断し、破壊のメカニズムについて洞察することが可能になる。本稿では,J-TEXT上での物理誘導特徴抽出(IDP-PGFE)に基づく解釈破壊予測器を設計する。物理誘導された特徴を抽出することにより、モデルの予測性能を効果的に向上する。解釈結果の妥当性を保証するためには,高性能モデルが必要である。 IDP-PGFEの解釈可能性の研究は、J-TEXT破壊の理解を提供し、一般に既存の破壊の理解と一致している。 IDP-PGFEは, J-TEXTにおける密度限界実験に向けて, 連続的に密度を増大させることにより, 破壊に応用されている。 PGFEの特徴の時間的進化は、ECRHの応用によって放射線による破壊が引き起こされ、破壊時の密度が低下することを示す。 RMPの適用は確かにJ-TEXTの密度限界を上昇させる。この解釈可能性の研究は、RMPがMHD不安定性だけでなく、密度限界破壊を遅らせる放射プロファイルにも影響を及ぼす密度限界破壊の物理的メカニズムの直観を導く。 Disruption prediction has made rapid progress in recent years, especially in machine learning (ML)-based methods. Understanding why a predictor makes a certain prediction can be as crucial as the prediction's accuracy for future tokamak disruption predictors. The purpose of most disruption predictors is accuracy or cross-machine capability. However, if a disruption prediction model can be interpreted, it can tell why certain samples are classified as disruption precursors. This allows us to tell the types of incoming disruption and gives us insight into the mechanism of disruption. This paper designs a disruption predictor called Interpretable Disruption Predictor based On Physics-guided feature extraction (IDP-PGFE) on J-TEXT. The prediction performance of the model is effectively improved by extracting physics-guided features. A high-performance model is required to ensure the validity of the interpretation results. 
The interpretability study of IDP-PGFE provides an understanding of J-TEXT disruptions that is generally consistent with the existing understanding of disruptions. IDP-PGFE has been applied to disruptions caused by continuously increasing the density towards the density limit in experiments on J-TEXT. The time evolution of the PGFE feature contributions shows that applying ECRH triggers radiation-caused disruption, which lowers the density at disruption, whereas applying RMP does raise the density limit in J-TEXT. The interpretability study guides intuition about the physical mechanisms of density limit disruption: RMPs affect not only the MHD instabilities but also the radiation profile, which delays density limit disruption.	公開日:2024-09-26 翻訳日:2024-11-09 15:46:48
# Mine yOur owN Anatomy: 極めて限られたラベルによる医用画像セグメンテーションの再検討 Mine yOur owN Anatomy: Revisiting Medical Image Segmentation with Extremely Limited Labels ( http://arxiv.org/abs/2209.13476v6 ) ライセンス: Link先を確認	Chenyu You, Weicheng Dai, Fenglin Liu, Yifei Min, Nicha C. Dvornek, Xiaoxiao Li, David A. Clifton, Lawrence Staib, James S. Duncan,	(参考訳) 近年のコントラスト学習の研究は, 医療画像セグメンテーションの文脈において, ラベルの少ないことのみを生かして, 優れた成果を上げている。既存の方法は、主にインスタンスの識別と不変マッピングに焦点を当てている。しかし、これらには3つの共通の落とし穴がある。(1) テール性: 医療画像データは通常、暗黙のロングテールのクラス分布に従う。したがって、訓練ですべてのピクセルを盲目的に活用することは、データの不均衡を招き、パフォーマンスを悪化させる。(2)一貫性:異なる解剖学的特徴の間のクラス内変動のために、セグメンテーションモデルが有意義で一貫性のある解剖学的特徴を学習したかどうかは依然として不明である。(3)多様性:データセット全体のスライス内相関は、ほとんど注目されてこなかった。これは、データセット自体を戦略的に利用し、異なる解剖学的視点から類似しているが異なるサンプルを発見するための、原則化されたアプローチを求める動機である。本稿では,Mine yOur owN Anatomy (MONA) と呼ばれる,半教師付き2次元医用画像セグメンテーションフレームワークを紹介する。まず、先行研究では、すべてのピクセルがモデルトレーニングに等しく重要であると論じており、これらだけでは、主に監視信号が欠如していることから、意味のある解剖学的特徴を定義することは不可能である、と実証的に観察している。より強力なデータ拡張と最も近い隣人を使って、不変性を学ぶための2つの簡単なソリューションを示します。第2に,医療画像の解剖学的特徴の集合体への分解を教師なしで行うことをモデルに促す目的の集合を構築した。最後に、我々は実験的かつ理論的に、異なるラベル付き設定で3つのベンチマークデータセットに対してMONAの有効性を実証し、異なるラベル付き半教師付き設定で新しい最先端を実現する。 Recent studies on contrastive learning have achieved remarkable performance solely by leveraging few labels in the context of medical image segmentation. Existing methods mainly focus on instance discrimination and invariant mapping. However, they face three common pitfalls: (1) tailness: medical image data usually follows an implicit long-tail class distribution. Blindly leveraging all pixels in training hence can lead to data imbalance issues and cause deteriorated performance; (2) consistency: it remains unclear whether a segmentation model has learned meaningful and yet consistent anatomical features due to the intra-class variations between different anatomical features; and (3) diversity: the intra-slice correlations within the entire dataset have received significantly less attention. 
This motivates us to seek a principled approach for strategically making use of the dataset itself to discover similar yet distinct samples from different anatomical views. In this paper, we introduce a novel semi-supervised 2D medical image segmentation framework termed Mine yOur owN Anatomy (MONA), and make three contributions. First, prior work argues that every pixel equally matters to the model training; we observe empirically that this alone is unlikely to define meaningful anatomical features, mainly due to the lack of a supervision signal. We show two simple solutions towards learning invariances: stronger data augmentations and nearest neighbors. Second, we construct a set of objectives that encourage the model to be capable of decomposing medical images into a collection of anatomical features in an unsupervised manner. Lastly, we demonstrate, both empirically and theoretically, the efficacy of MONA on three benchmark datasets with different labeled settings, achieving a new state of the art under different labeled semi-supervised settings.	公開日:2024-09-22 翻訳日:2024-11-09 15:46:48
# FIRE:エッジコンピューティングマイグレーションのための障害適応型強化学習フレームワーク FIRE: A Failure-Adaptive Reinforcement Learning Framework for Edge Computing Migrations ( http://arxiv.org/abs/2209.14399v3 ) ライセンス: Link先を確認	Marie Siew, Shikhar Sharma, Zekai Li, Kun Guo, Chao Xu, Tania Lorido-Botran, Tony Q. S. Quek, Carlee Joe-Wong,	(参考訳) エッジコンピューティングでは、ユーザのモビリティのために、ユーザのサービスプロファイルが移行される。強化学習(RL)フレームワークは、しばしばシミュレーションデータに基づいて訓練される。しかし、既存のRLフレームワークは時折サーバの障害を見落としており、これは、自律運転やリアルタイム障害検出のような遅延に敏感なアプリケーションに影響を与えている。それでも、過去のトレーニングデータで適切に表現されていないこれらの失敗(まれな出来事)は、データ駆動RLアルゴリズムに挑戦する。実世界のトレーニング用アプリケーションにおいて、故障頻度を調整することは現実的ではないため、エッジコンピューティングのディジタルツイン環境でRLポリシーをトレーニングすることで、まれな事象に適応するフレームワークであるFIREを導入する。 ImREは重要なサンプリングに基づくQ-ラーニングアルゴリズムであり、希少事象をその値関数への影響に比例してサンプリングする。 FIREは、個々のサービスプロファイルと共有サービスのプロファイルにまたがる遅延、マイグレーション、障害、バックアップの配置コストを考慮に入れている。我々はImREの有界性と最適性への収束性を証明する。次に、拡張性を高めるために、新しいQ-ラーニング(ImDQL)とアクタ評論家(ImACRE)バージョンを導入します。リスクトレランスの異なるユーザに対応するために、当社のフレームワークを拡張しています。トレース駆動実験により,障害発生時のバニラRLやグリーディベースラインと比較して,FIREがコストを削減できることが判明した。 In edge computing, users' service profiles are migrated due to user mobility. Reinforcement learning (RL) frameworks have been proposed to do so, often trained on simulated data. However, existing RL frameworks overlook occasional server failures, which although rare, impact latency-sensitive applications like autonomous driving and real-time obstacle detection. Nevertheless, these failures (rare events), being not adequately represented in historical training data, pose a challenge for data-driven RL algorithms. As it is impractical to adjust failure frequency in real-world applications for training, we introduce FIRE, a framework that adapts to rare events by training a RL policy in an edge computing digital twin environment. We propose ImRE, an importance sampling-based Q-learning algorithm, which samples rare events proportionally to their impact on the value function. FIRE considers delay, migration, failure, and backup placement costs across individual and shared service profiles. 
We prove ImRE's boundedness and convergence to optimality. Next, we introduce novel deep Q-learning (ImDQL) and actor-critic (ImACRE) versions of our algorithm to enhance scalability. We extend our framework to accommodate users with varying risk tolerances. Through trace-driven experiments, we show that FIRE reduces costs compared to vanilla RL and the greedy baseline in the event of failures.	公開日:2024-09-22 翻訳日:2024-11-09 15:35:37
# 分布強化学習におけるリターン分布は最適化にどう役立つか How Does Return Distribution in Distributional Reinforcement Learning Help Optimization? ( http://arxiv.org/abs/2209.14513v2 ) ライセンス: Link先を確認	Ke Sun, Bei Jiang, Linglong Kong,	(参考訳) 分布強化学習は、標準RLのように期待値だけを学習するのではなく、リターン分布全体を学習することに焦点を当てており、性能向上に顕著な成功を収めている。これらの進歩にもかかわらず、分布RLにおいてリターン分布がどのように役立つかについての理解は依然として限られている。本研究では、Neural Fitted Z-Iteration (Neural FZI)フレームワークにおいて、古典的RLにはないリターン分布の追加知識を利用することによる、分布RLの最適化上の利点を検討する。まず, 分布RLの分布損失は, 良好な滑らかさ特性を持ち, したがって安定した勾配を享受できることを実証する。これは最適化の安定性を促進する傾向と一致する。さらに、リターン分布を分解することにより、分布RLの加速効果を明らかにする。各環境における勾配推定の分散で測ったとき、リターン分布の近似が適切であれば、分布RLは良好に動作することを示す。厳密な実験は、分布RLの安定な最適化挙動とその加速効果を古典的RLと比較して検証する。本研究は,分布RLアルゴリズムのリターン分布が最適化にどう役立つかを明らかにする。 Distributional reinforcement learning, which focuses on learning the entire return distribution instead of only its expectation as in standard RL, has demonstrated remarkable success in enhancing performance. Despite these advancements, our understanding of how the return distribution helps within distributional RL remains limited. In this study, we investigate the optimization advantages of distributional RL by utilizing its extra return distribution knowledge over classical RL within the Neural Fitted Z-Iteration (Neural FZI) framework. To begin with, we demonstrate that the distribution loss of distributional RL has desirable smoothness characteristics and hence enjoys stable gradients, which is in line with its tendency to promote optimization stability. Furthermore, the acceleration effect of distributional RL is revealed by decomposing the return distribution. It shows that distributional RL can perform favorably if the return distribution approximation is appropriate, measured by the variance of gradient estimates in each environment. Rigorous experiments validate the stable optimization behaviors of distributional RL and its acceleration effects compared to classical RL. Our research findings illuminate how the return distribution in distributional RL algorithms helps the optimization.	公開日:2024-09-23 翻訳日:2024-11-09 15:35:37
# DICTDIS:改良NMTのための曖昧さを制限した辞書 DICTDIS: Dictionary Constrained Disambiguation for Improved NMT ( http://arxiv.org/abs/2210.06996v3 ) ライセンス: Link先を確認	Ayush Maheshwari, Preethi Jyothi, Ganesh Ramakrishnan,	(参考訳) ドメイン固有ニューラルマシン翻訳(NMT)システムは、多言語社会における多様なユーザ集合に情報をアクセスできるようにする可能性において、社会的に重要な存在である。このようなNMTシステムは、語彙的に制約され、ドメイン固有の辞書から引き出されることが望ましい。辞書は、単語の多文性のために、ソースワード/フレーズに対して複数の候補翻訳を提示することができる。次に、オンスはNMTモデル上で、文脈的に最も適切な候補を選択する。以前の作業ではこの問題をほとんど無視しており、ターゲット語やフレーズを単一の制約に置き換える単一の制約設定に重点を置いていた。本研究では辞書から派生した複数の候補翻訳の曖昧さを解消する語彙制約付きNMTシステムであるDictDisを提案する。我々は、複数の辞書候補とのトレーニングデータを増強し、複数の候補制約を暗黙的に調整することで、トレーニング中の曖昧さを積極的に促進する。我々は、規制、金融、工学を含む様々な分野において、英語・ヒンディー語・英語・ドイツ語文に関する広範な実験を通じて、DictDisの有用性を実証する。また、標準ベンチマークテストデータセットの比較も行う。語彙的に制約された非拘束NMTに対する既存のアプローチと比較して、制限されたコピーや曖昧さに関連するすべての領域に対する優れた性能を示し、また、いくつかの領域において最大2-3 BLEU点の周波数改善を得る。 Domain-specific neural machine translation (NMT) systems (e.g., in educational applications) are socially significant with the potential to help make information accessible to a diverse set of users in multilingual societies. It is desirable that such NMT systems be lexically constrained and draw from domain-specific dictionaries. Dictionaries could present multiple candidate translations for a source word/phrase due to the polysemous nature of words. The onus is then on the NMT model to choose the contextually most appropriate candidate. Prior work has largely ignored this problem and focused on the single candidate constraint setting wherein the target word or phrase is replaced by a single constraint. In this work we present DictDis, a lexically constrained NMT system that disambiguates between multiple candidate translations derived from dictionaries. We achieve this by augmenting training data with multiple dictionary candidates to actively encourage disambiguation during training by implicitly aligning multiple candidate constraints. 
We demonstrate the utility of DictDis via extensive experiments on English-Hindi and English-German sentences in a variety of domains including regulatory, finance, and engineering. We also present comparisons on standard benchmark test datasets. In comparison with existing approaches for lexically constrained and unconstrained NMT, we demonstrate superior performance with respect to constraint-copy and disambiguation-related measures on all domains while also obtaining improved fluency of up to 2-3 BLEU points on some domains.	公開日:2024-09-27 翻訳日:2024-11-09 15:35:37
# 航空機エンジンブレードの知的欠陥検出のための超画素知覚グラフニューラルネットワーク Superpixel perception graph neural network for intelligent defect detection of aero-engine blade ( http://arxiv.org/abs/2210.07539v2 ) ライセンス: Link先を確認	Hongbing Shang, Qixiu Yang, Chuang Sun, Xuefeng Chen, Ruqiang Yan,	(参考訳) エアロエンジンは航空機や他の宇宙船のコアコンポーネントである。高速回転翼は空気を吸って完全に燃焼することで力を提供し、様々な欠陥が必然的に発生し、航空エンジンの運転安全性を脅かす。そのため、このような複雑なシステムには定期的な検査が不可欠である。しかしながら、ボアスコープ検査を行う既存の技術は、労働集約的で、時間がかかり、経験に依存している。特徴抽出のための多段階グラフ畳み込みネットワーク(MSGCN)と領域提案のための超画素知覚領域提案ネットワーク(SPRPN)を用いて,この技術を知能で実現するために,新しい超画素知覚グラフニューラルネットワーク(SPGNN)を提案する。まず、複雑な不規則なテクスチャをキャプチャするために、画像は一連のパッチに変換され、グラフ表現を得る。次に、複数のGCNブロックからなるMSGCNがグラフ構造の特徴を抽出し、グラフレベルでグラフ情報処理を行う。最後に、SPRPNは、グラフ表現特徴とスーパーピクセル知覚特徴を融合させて知覚境界ボックスを生成する。そのため,提案したSPGNNは,SPGNNパイプライン全体のグラフレベルにおいて,常に特徴抽出と情報伝達を実装し,受容野の減少と情報損失を軽減する。 SPGNNの有効性を検証するため,3000枚の画像を用いたシミュレートされたブレードデータセットを構築した。アルミニウムのパブリックデータセットは、異なる方法のパフォーマンスを検証するためにも使われる。実験結果から,提案したSPGNNは最先端手法と比較して優れた性能を示した。 Aero-engine is the core component of aircraft and other spacecraft. The high-speed rotating blades provide power by sucking in air and fully combusting, and various defects will inevitably occur, threatening the operation safety of aero-engine. Therefore, regular inspections are essential for such a complex system. However, existing traditional technology which is borescope inspection is labor-intensive, time-consuming, and experience-dependent. To endow this technology with intelligence, a novel superpixel perception graph neural network (SPGNN) is proposed by utilizing a multi-stage graph convolutional network (MSGCN) for feature extraction and superpixel perception region proposal network (SPRPN) for region proposal. First, to capture complex and irregular textures, the images are transformed into a series of patches, to obtain their graph representations. Then, MSGCN composed of several GCN blocks extracts graph structure features and performs graph information processing at graph level. 
Last but not least, the SPRPN is proposed to generate perceptual bounding boxes by fusing graph representation features and superpixel perception features. Therefore, the proposed SPGNN always implements feature extraction and information transmission at the graph level in the whole SPGNN pipeline, to alleviate the reduction of receptive field and information loss. To verify the effectiveness of SPGNN, we construct a simulated blade dataset with 3000 images. A public aluminum dataset is also used to validate the performances of different methods. The experimental results demonstrate that the proposed SPGNN has superior performance compared with the state-of-the-art methods.	公開日:2024-09-22 翻訳日:2024-11-09 15:35:37
# ベイジアンニューラルネットワークのためのデータサブサンプリング Data Subsampling for Bayesian Neural Networks ( http://arxiv.org/abs/2210.09141v2 ) ライセンス: Link先を確認	Eiji Kawasaki, Markus Holzmann, Lawrence Adu-Gyamfi,	(参考訳) Markov Chain Monte Carlo (MCMC)アルゴリズムは大規模なデータセットに対してうまくスケールせず、ニューラルネットワークの事後分布サンプリングを困難にしている。本稿では,ベイズ推論の文脈で、サブサンプリングしたバッチデータ(ミニバッチ)を用いて尤度を評価できるようにし、スケーラビリティに対処する新しいアルゴリズムとして,Penalty Bayesian Neural Networks - PBNNを提案する。 PBNNは、メトロポリス・ヘイスティングス・アルゴリズムの一般化の一環としてペナルティ項を組み込むことによって、他のナイーブ・サブサンプリング技術に固有のバイアスを回避する。既存のMCMCフレームワークとPBNNを統合することは容易であり、損失関数の分散は単に受け入れ確率を減少させるだけである。合成データとMNISTデータセットの代替サンプリング戦略を比較することで、PBNNは小さなミニバッチサイズであっても優れた予測性能が得られることを示した。 PBNNは,ミニバッチサイズを変化させることで予測分布を較正する新しい手法を提供し,予測の過信を著しく低減することを示す。 Markov Chain Monte Carlo (MCMC) algorithms do not scale well to large datasets, leading to difficulties in Neural Network posterior sampling. In this paper, we propose Penalty Bayesian Neural Networks (PBNNs), a new algorithm that allows the evaluation of the likelihood using subsampled batch data (mini-batches) in a Bayesian inference context towards addressing scalability. PBNN avoids the biases inherent in other naive subsampling techniques by incorporating a penalty term as part of a generalization of the Metropolis-Hastings algorithm. We show that it is straightforward to integrate PBNN with existing MCMC frameworks, as the variance of the loss function merely reduces the acceptance probability. By comparing with alternative sampling strategies on both synthetic data and the MNIST dataset, we demonstrate that PBNN achieves good predictive performance even for small mini-batch sizes of data. We show that PBNN provides a novel approach for calibrating the predictive distribution by varying the mini-batch size, significantly reducing predictive overconfidence.	公開日:2024-09-23 翻訳日:2024-11-09 15:35:37
# 等変拡散モデルを用いた構造に基づく医薬品設計 Structure-based Drug Design with Equivariant Diffusion Models ( http://arxiv.org/abs/2210.13695v3 ) ライセンス: Link先を確認	Arne Schneuing, Charles Harris, Yuanqi Du, Kieran Didi, Arian Jamasb, Ilia Igashov, Weitao Du, Carla Gomes, Tom Blundell, Pietro Lio, Max Welling, Michael Bronstein, Bruno Correia,	(参考訳) SBDD(Structure-based drug design)は、タンパク質標的に高親和性と特異性に結合する小分子リガンドを設計することを目的としている。創発的SBDD法は、タンパク質標的と複雑な薬物の構造データを利用して、新しい薬物候補を提案する。これらのアプローチは通常、結合ポケットを使って1つの原子を自己回帰的に配置する。近年、拡散生成モデルの急増がこの領域に入り、自然リガンドの統計的性質をより忠実に捉えることを約束している。しかしながら、既存のほとんどの手法は、化合物のボトムアップ・デ・ノボ設計にのみ焦点をあてたり、タスク固有のモデルで他の薬物開発課題に取り組むことに焦点を当てている。後者は適切なデータセットのキュレーション、モデルの慎重なエンジニアリング、各タスクのスクラッチからのトレーニングを必要とする。ここでは,オフザシェルフ特性の最適化,明示的負の設計,着色による部分分子設計など,より広範な問題に対して,単一の事前学習拡散モデルを適用する方法を示す。本稿では,SBDDを3次元条件付き生成問題として定式化し,タンパク質ポケット上に条件付きリガンドを生成するSE(3)等価拡散モデルDiffSBDDを提案する。我々のサイリコ実験では、DiffSBDDが地上の真実データの統計を効果的に捉えていることが示されています。さらに、様々な計算量に応じて、生成した薬物候補を改善するために、追加の制約をどのように利用できるかを示す。これらの結果は, 拡散モデルが従来の手法よりも正確に構造データの複雑な分布を表現し, サンプリング戦略以外の設計目標や制約を組み込むことができるという仮定を支持している。 Structure-based drug design (SBDD) aims to design small-molecule ligands that bind with high affinity and specificity to pre-determined protein targets. Generative SBDD methods leverage structural data of drugs in complex with their protein targets to propose new drug candidates. These approaches typically place one atom at a time in an autoregressive fashion using the binding pocket as well as previously added ligand atoms as context in each step. Recently a surge of diffusion generative models has entered this domain which hold promise to capture the statistical properties of natural ligands more faithfully. However, most existing methods focus exclusively on bottom-up de novo design of compounds or tackle other drug development challenges with task-specific models. The latter requires curation of suitable datasets, careful engineering of the models and retraining from scratch for each task. 
Here we show how a single pre-trained diffusion model can be applied to a broader range of problems, such as off-the-shelf property optimization, explicit negative design, and partial molecular design with inpainting. We formulate SBDD as a 3D-conditional generation problem and present DiffSBDD, an SE(3)-equivariant diffusion model that generates novel ligands conditioned on protein pockets. Our in silico experiments demonstrate that DiffSBDD captures the statistics of the ground truth data effectively. Furthermore, we show how additional constraints can be used to improve the generated drug candidates according to a variety of computational metrics. These results support the assumption that diffusion models represent the complex distribution of structural data more accurately than previous methods, and are able to incorporate additional design objectives and constraints changing nothing but the sampling strategy.	公開日:2024-09-23 翻訳日:2024-11-09 15:35:37
# ソフトラベルプロトタイプを用いた事例から新しい課題を学習する Learning New Tasks from a Few Examples with Soft-Label Prototypes ( http://arxiv.org/abs/2210.17437v4 ) ライセンス: Link先を確認	Avyav Kumar Singh, Ekaterina Shutova, Helen Yannakoudakis,	(参考訳) 既存のNLPにおける少数ショット学習へのアプローチは、大言語モデル(LLM)および/またはこれらを微調整して、アウト・オブ・ディストリビューションデータの一般化に頼っている。そこで本研究では,入力領域における異なるクラスの分布を総合的に把握するソフトラベルのプロトタイプ(SLP)に基づく,新しい数発学習手法を提案する。本稿では,NLP タスクをクラスごとのごく少数の例 (4, 8, 16) から学習することに集中し,本手法がパラメータ効率が高く,テスト済みタスクの大部分に対して優れた性能を達成できることを実験的に実証する。また,本手法は,より汎用的な学習環境,主にメタラーニングに組み込むことで,強力なベースラインに対して優れた性能が得られることを示す。 Existing approaches to few-shot learning in NLP rely on large language models (LLMs) and/or fine-tuning of these to generalise on out-of-distribution data. In this work, we propose a novel few-shot learning approach based on soft-label prototypes (SLPs) designed to collectively capture the distribution of different classes across the input domain space. We focus on learning previously unseen NLP tasks from very few examples (4, 8, 16) per class and experimentally demonstrate that our approach achieves superior performance on the majority of tested tasks in this data-lean setting while being highly parameter efficient. We also show that our few-shot adaptation method can be integrated into more generalised learning settings, primarily meta-learning, to yield superior performance against strong baselines.	公開日:2024-09-22 翻訳日:2024-11-09 15:35:37
# 複素逆温度平面における量子臨界性のシグナチャ Signatures of quantum criticality in the complex inverse temperature plane ( http://arxiv.org/abs/2211.00813v2 ) ライセンス: Link先を確認	Yang Liu, Songtai Lv, Yang Yang, Haiyuan Zou,	(参考訳) 複素分割関数とフィッシャー零点の概念は、有限温度および実時間動的相転移に対する固有の統計メカニズムを提供する。我々はこれらの複雑化の効用を量子相転移に拡張する。線あるいは閉曲線上の異なるフィッシャー零点を正確に同定し、一次元横場イジングモデルに対する領域壁励起や制限中間子との対応を解明する。フィッシャー零点の交叉挙動は、励起エネルギースケールが定量的に決定される量子相転移付近の臨界性を示す魅力的な図である。さらに、テンソルネットワーク計算による結果を確認し、閉零曲線の破壊による分解中間子励起の明確な信号を示す。我々の結果は、量子相転移のためのフィッシャー零点の重要な特徴を明白に示し、量子臨界性を探るために新しい経路を開く。 Concepts of the complex partition functions and the Fisher zeros provide intrinsic statistical mechanisms for finite temperature and real time dynamical phase transitions. We extend the utility of these complexifications to quantum phase transitions. We exactly identify different Fisher zeros on lines or closed curves and elucidate their correspondence with domain-wall excitations or confined mesons for the one-dimensional transverse field Ising model. The crossover behavior of the Fisher zeros provides a fascinating picture for criticality near the quantum phase transition, where the excitation energy scales are quantitatively determined. We further confirm our results by tensor network calculations and demonstrate a clear signal of deconfined meson excitations from the disruption of the closed zero curves. Our results unambiguously show significant features of Fisher zeros for a quantum phase transition and open up a new route to explore quantum criticality.	公開日:2024-09-24 翻訳日:2024-11-09 15:35:37
# Solidago: モジュール型のコラボレーションスコーリングパイプライン Solidago: A Modular Collaborative Scoring Pipeline ( http://arxiv.org/abs/2211.01179v3 ) ライセンス: Link先を確認	Lê Nguyên Hoang, Romain Beylerian, Bérangère Colbois, Julien Fageot, Louis Faucon, Aidan Jungo, Alain Le Noac'h, Adrien Matissart, Oscar Villemaud,	(参考訳) 本稿では,任意のユーザコミュニティが任意のエンティティを共同でスコアすることを可能にする,エンドツーエンドのモジュールパイプラインであるSolidagoを提案する。 Solidagoは6つのモジュールの分解を提案している。まず、プリトラストとピアツーピアのブーチを使用して、信頼スコアをユーザーに割り当てる。第2に、参加に基づいて、信頼スコアは、エンティティごとのユーザ当たりの投票権に変換される。第3に、各ユーザに対して、ユーザの評価データから嗜好モデルを学ぶ。第4に、ユーザーのモデルは同様の規模に置かれる。第5に、これらのモデルは安全に集約されます。 6番目は、人間が読めるグローバルスコアを得るために後処理される。また、新しい信頼伝播アルゴリズム、最先端スケーリングおよび集約ソリューションの適応を含む6つのモジュールのデフォルト実装も提案する。当社のパイプラインはオープンソースプラットフォームである Tournesol.app にデプロイされています。これにより、あらゆる種類のエンティティの協調的、効果的、スケーラブル、公正、解釈可能、セキュアなスコアリングのための魅力的な基盤を築きます。 This paper presents Solidago, an end-to-end modular pipeline to allow any community of users to collaboratively score any number of entities. Solidago proposes a six-module decomposition. First, it uses pretrust and peer-to-peer vouches to assign trust scores to users. Second, based on participation, trust scores are turned into voting rights per user per entity. Third, for each user, a preference model is learned from the user's evaluation data. Fourth, users' models are put on a similar scale. Fifth, these models are securely aggregated. Sixth, models are post-processed to yield human-readable global scores. We also propose default implementations of the six modules, including a novel trust propagation algorithm, and adaptations of state-of-the-art scaling and aggregation solutions. Our pipeline has been successfully deployed on the open-source platform tournesol.app. We thereby lay an appealing foundation for the collaborative, effective, scalable, fair, interpretable and secure scoring of any set of entities.	公開日:2024-09-25 翻訳日:2024-11-09 15:35:37
# 解釈可能な機械学習を用いたIctal-Interictal-Injury Continuumにおける脳波パターン分類での臨床医の性能向上 Improving Clinician Performance in Classification of EEG Patterns on the Ictal-Interictal-Injury Continuum using Interpretable Machine Learning ( http://arxiv.org/abs/2211.05207v5 ) ライセンス: Link先を確認	Alina Jade Barnett, Zhicheng Guo, Jin Jing, Wendong Ge, Peter W. Kaplan, Wan Yee Kong, Ioannis Karakis, Aline Herlopian, Lakshman Arcot Jayagopal, Olga Taraschenko, Olga Selioutski, Gamaleldin Osman, Daniel Goldenholz, Cynthia Rudin, M. Brandon Westover,	(参考訳) 集中治療室(ICU)では、重篤な脳損傷を防ぐために、重症患者は脳波(EEG)で監視される。モニター可能な患者の数は、脳波を読むことのできる訓練された医師の数によって制限され、脳波の解釈は主観的で、観察者間の変動が生じやすい。脳波のための自動ディープラーニングシステムは、人間のバイアスを減らし、診断プロセスを加速する可能性がある。しかし、ブラックボックスのディープラーニングモデルは信頼できない、トラブルシューティングが難しい、現実のアプリケーションでは説明責任が欠如しているため、臨床医による信頼と採用の欠如につながっている。これらの課題に対処するために、有害な脳波パターンの存在を予測するだけでなく、その決定に関する高品質なケースベース説明を提供する、解釈可能な新しいディープラーニングモデルを提案する。我々のモデルは解釈可能であることを制約されているにもかかわらず、対応するブラックボックスモデルよりも優れた性能を発揮する。学習した2次元埋め込み空間は、ictal-interictal-injury continuum の脳波パターンの構造に関する最初の大域的概要を提供する。我々のモデルがどのように決定に達したかを理解する能力は、臨床医が有害な脳活動の診断と治療をより正確に行うのに役立つだけでなく、臨床実践における機械学習モデルの信頼と採用を高めるのに役立つ。 In intensive care units (ICUs), critically ill patients are monitored with electroencephalograms (EEGs) to prevent serious brain injury. The number of patients who can be monitored is constrained by the availability of trained physicians to read EEGs, and EEG interpretation can be subjective and prone to inter-observer variability. Automated deep learning systems for EEG could reduce human bias and accelerate the diagnostic process. However, black box deep learning models are untrustworthy, difficult to troubleshoot, and lack accountability in real-world applications, leading to a lack of trust and adoption by clinicians. To address these challenges, we propose a novel interpretable deep learning model that not only predicts the presence of harmful brainwave patterns but also provides high-quality case-based explanations of its decisions. 
Our model performs better than the corresponding black box model, despite being constrained to be interpretable. The learned 2D embedded space provides the first global overview of the structure of ictal-interictal-injury continuum brainwave patterns. The ability to understand how our model arrived at its decisions will not only help clinicians to diagnose and treat harmful brain activities more accurately but also increase their trust and adoption of machine learning models in clinical practice; this could be an integral component of the ICU neurologists' standard workflow.	公開日:2024-09-25 翻訳日:2024-11-09 15:35:37
# スパースディープニューラルネットワークアーキテクチャのための適応的かつ安定性を促進する層ごとの学習手法 An Adaptive and Stability-Promoting Layerwise Training Approach for Sparse Deep Neural Network Architecture ( http://arxiv.org/abs/2211.06860v2 ) ライセンス: Link先を確認	C G Krishnanunni, Tan Bui-Thanh,	(参考訳) この研究は、与えられたトレーニングデータセットに対してうまく一般化するディープニューラルネットワーク(DNN)アーキテクチャを段階的に開発するための2段階適応フレームワークを提案する。第1段階では、新しいレイヤを毎回追加し、前のレイヤでパラメータを凍結することで独立してトレーニングする、レイヤワイズトレーニングアプローチが採用されている。我々は、多様体正則化、スパーシティ正則化、物理インフォームド項を用いることで、DNNに望ましい構造を課す。本稿では, 学習アルゴリズムの望ましい特性として, エプシロン・デルタ安定促進の概念を導入し, 多様体正則化を用いることで, エプシロン・デルタ安定促進アルゴリズムが得られることを示す。さらに,新たに加えた層が訓練可能であるための必要条件を導出し,トレーニング飽和問題について検討する。アルゴリズムの第2段(後処理)では、浅いネットワークのシーケンスを用いて、第1段で生成された残差から情報を抽出し、予測精度を向上させる。プロトタイプの回帰問題と分類問題に関する数値的研究により,提案手法が同一サイズの完全連結DNNより優れていることを示す。さらに、物理インフォームドニューラルネットワーク(PINN)に偏微分方程式を解くための適応型アーキテクチャ戦略を組み込むことにより、適応型PINNは標準のPINNよりも優れているだけでなく、証明可能な安定性を持つ解釈可能な隠れ層を生成することを数値的に示す。また, 楕円型偏微分方程式に支配される逆問題の解法として, アーキテクチャ設計戦略を適用した。 This work presents a two-stage adaptive framework for progressively developing deep neural network (DNN) architectures that generalize well for a given training data set. In the first stage, a layerwise training approach is adopted where a new layer is added each time and trained independently by freezing parameters in the previous layers. We impose desirable structures on the DNN by employing manifold regularization, sparsity regularization, and physics-informed terms. We introduce an epsilon-delta stability-promoting concept as a desirable property for a learning algorithm and show that employing manifold regularization yields an epsilon-delta stability-promoting algorithm. Further, we also derive the necessary conditions for the trainability of a newly added layer and investigate the training saturation problem. In the second stage of the algorithm (post-processing), a sequence of shallow networks is employed to extract information from the residual produced in the first stage, thereby improving the prediction accuracy. 
Numerical investigations on prototype regression and classification problems demonstrate that the proposed approach can outperform fully connected DNNs of the same size. Moreover, by equipping the physics-informed neural network (PINN) with the proposed adaptive architecture strategy to solve partial differential equations, we numerically show that adaptive PINNs not only are superior to standard PINNs but also produce interpretable hidden layers with provable stability. We also apply our architecture design strategy to solve inverse problems governed by elliptic partial differential equations.	公開日:2024-09-22 翻訳日:2024-11-09 15:35:37
# 量子コンピュータにおける振動構造の測定回数の最適化:座標と測定方法 Optimizing the number of measurements for vibrational structure on quantum computers: coordinates and measurement schemes ( http://arxiv.org/abs/2211.11615v2 ) ライセンス: Link先を確認	Marco Majland, Rasmus Berg Jensen, Mads Greisen Højlund, Nikolaj Thomas Zinner, Ove Christiansen,	(参考訳) 短期デバイスに対する実用的な量子優位性の実証を禁止している主な課題の1つは、基底状態エネルギーなどの関連する物理量の推定に過剰な測定オーバーヘッドがかかることである。しかし、分子の電子的構造と振動的構造に大きな違いがあるため、計算アンハーモニック、振動状態の資源要求をいかに減らすかという問題は、電子的構造よりも比較的未解明のままである。重要なことに、ボゾン交換関係、区別可能なヒルベルト空間、振動座標は、資源要求を最小化するために活用できる振動系の操作を可能にする。本研究では, 種々の3モード(6モード)分子の無調波, 振動状態の推定に必要な測定値に対する, 異なる座標系と測定方法の影響について検討する。従来の振動構造プログラムから立方体ハミルトニアンの自動構成に基づいて, 座標変換による測定回数の削減を図り, 最大7倍(2.5倍)の3倍(1.5倍)の平均値を示す。 One of the primary challenges prohibiting demonstrations of practical quantum advantages for near-term devices amounts to excessive measurement overheads for estimating relevant physical quantities such as ground state energies. However, with major differences between the electronic and vibrational structure of molecules, the question of how the resource requirements of computing anharmonic, vibrational states can be reduced remains relatively unexplored compared to its electronic counterpart. Importantly, bosonic commutation relations, distinguishable Hilbert spaces and vibrational coordinates allow manipulations of the vibrational system that can be exploited to minimize resource requirements. In this work, we investigate the impact of different coordinate systems and measurement schemes on the number of measurements needed to estimate anharmonic, vibrational states for a variety of three-mode (six-mode) molecules. We demonstrate an average of 3-fold (1.5-fold), with up to 7-fold (2.5-fold), reduction in the number of measurements required by employing appropriate coordinate transformations, based on an automized construction of qubit Hamiltonians from a conventional vibrational structure program.	公開日:2024-09-24 翻訳日:2024-11-09 15:35:37
# 太陽と空の下のビデオケースシャドウ検出 Video Instance Shadow Detection Under the Sun and Sky ( http://arxiv.org/abs/2211.12827v3 ) ライセンス: Link先を確認	Zhenghao Xing, Tianyu Wang, Xiaowei Hu, Haoran Wu, Chi-Wing Fu, Pheng-Ann Heng,	(参考訳) 写真編集や光方向推定などのアプリケーションに不可欠なインスタンスのシャドー検出は、シャドーインスタンス、オブジェクトインスタンス、およびそれらの関連性を予測する上で大きな進歩を遂げている。このタスクの動画への拡張は、様々なビデオデータに注釈を付けることや、協会内の隠蔽や一時的な消滅に起因する複雑さに対処することの課題を示す。これらの課題に対応するために、ラベル付き画像データとラベルなしビデオデータの両方を活用する半教師付きビデオインスタンスシャドウ検出フレームワークViShadowを紹介した。 ViShadowは2段階のトレーニングパイプラインを備えている。第1ステージはラベル付きイメージデータを利用して、クロスフレームペアリングのための対照的な学習を通じて、シャドーとオブジェクトインスタンスを識別する。第2段階ではラベルのないビデオが採用され、追跡能力を高めるために関連するサイクル一貫性の損失が組み込まれている。一時的な消失を管理し、追跡継続性を確保するための検索機構が導入された。ラベル付きトレーニングビデオとラベル付きテストビデオと、SOAP-VIDメトリックを含むSOBA-VIDデータセットを、VISDソリューションの定量的評価のために導入する。 ViShadowの有効性は、ビデオインペインティング、インスタンスクローン、シャドウ編集、テキストインストラクションされたシャドウオブジェクト操作など、様々なビデオレベルのアプリケーションを通じてさらに実証されている。 Instance shadow detection, crucial for applications such as photo editing and light direction estimation, has undergone significant advancements in predicting shadow instances, object instances, and their associations. The extension of this task to videos presents challenges in annotating diverse video data and addressing complexities arising from occlusion and temporary disappearances within associations. In response to these challenges, we introduce ViShadow, a semi-supervised video instance shadow detection framework that leverages both labeled image data and unlabeled video data for training. ViShadow features a two-stage training pipeline: the first stage, utilizing labeled image data, identifies shadow and object instances through contrastive learning for cross-frame pairing. The second stage employs unlabeled videos, incorporating an associated cycle consistency loss to enhance tracking ability. A retrieval mechanism is introduced to manage temporary disappearances, ensuring tracking continuity. 
The SOBA-VID dataset, comprising unlabeled training videos and labeled testing videos, along with the SOAP-VID metric, is introduced for the quantitative evaluation of VISD solutions. The effectiveness of ViShadow is further demonstrated through various video-level applications such as video inpainting, instance cloning, shadow editing, and text-instructed shadow-object manipulation.	公開日:2024-09-24 翻訳日:2024-11-09 15:35:37
# オンデバイストレーニング: 既存のシステムに関する最初の概要 On-device Training: A First Overview on Existing Systems ( http://arxiv.org/abs/2212.00824v3 ) ライセンス: Link先を確認	Shuai Zhu, Thiemo Voigt, JeongGil Ko, Fatemeh Rahimian,	(参考訳) 機械学習(ML)とディープラーニング(DL)の最近のブレークスルーは、幅広いアプリケーションドメインにまたがる様々なインテリジェントシステムの設計と開発を触媒している。既存の機械学習モデルの多くは大きなメモリと計算能力を必要とするが、リソースに制約のあるデバイスにも、いくつかのモデルをデプロイする努力が続けられている。初期のアプリケーションシステムの大半はMLとDLモデルの推論機能を活用することに重点を置いており、さまざまなモバイルおよび組み込みセンシングコンポーネントから取得したデータは、分類やセグメンテーションといったアプリケーション目標のためにこれらのモデルを通して処理される。最近では、ML/DLモデルトレーニングにモバイルおよび組み込みコンピューティングリソースを活用するという概念が注目されている。こうした機能により、(i) 無線リンクを介してデータを共有することなくローカルデータでモデルを訓練でき、設計段階からプライバシを保護した計算が可能になること、(ii) モデルのパーソナライズと環境適応、(iii) 安定したインターネット接続のない、遠隔でアクセスしにくい場所への正確なモデルの配置、が可能になるためである。この研究は、デバイス上でのモデルトレーニングを可能にする最先端のシステム研究の要約と分析を目標とし、システムの観点からデバイス上でのトレーニングに関する調査を提供する。 The recent breakthroughs in machine learning (ML) and deep learning (DL) have catalyzed the design and development of various intelligent systems over wide application domains. While most existing machine learning models require large memory and computing power, efforts have been made to deploy some models on resource-constrained devices as well. A majority of the early application systems focused on exploiting the inference capabilities of ML and DL models, where data captured from different mobile and embedded sensing components are processed through these models for application goals such as classification and segmentation. More recently, the concept of exploiting the mobile and embedded computing resources for ML/DL model training has gained attention, as such capabilities allow (i) the training of models via local data without the need to share data over wireless links, thus enabling privacy-preserving computation by design, (ii) model personalization and environment adaptation, and (iii) deployment of accurate models in remote and hardly accessible locations without stable internet connectivity. 
This work targets to summarize and analyze state-of-the-art systems research that allows such on-device model training capabilities and provide a survey of on-device training from a systems perspective.	公開日:2024-09-23 翻訳日:2024-11-09 15:35:37
# モバイルアプリケーションにおけるAI技術に関する実証的研究 An Empirical Study of AI Techniques in Mobile Applications ( http://arxiv.org/abs/2212.01635v3 ) ライセンス: Link先を確認	Yinghua Li, Xueqi Dang, Haoye Tian, Tiezhu Sun, Zhijie Wang, Lei Ma, Jacques Klein, Tegawendé F. Bissyandé,	(参考訳) モバイルアプリケーションへの人工知能(AI)の統合は、さまざまなドメインを大きく変え、ユーザエクスペリエンスを高め、高度な機械学習(ML)とディープラーニング(DL)技術を通じてパーソナライズされたサービスを提供する。 AI駆動のモバイルアプリは通常、ML/DL技術を活用して画像認識や自然言語処理などの重要なタスクを実行するアプリケーションを指す。本稿では、デバイス上でのMLアプリ、デバイス上でのDLアプリ、AIサービスをサポートする(クラウドベースの)アプリなど、AIアプリケーションに関する最も広範な実証的研究を行った。私たちの研究は、56,682の現実世界のAIアプリケーションを含み、3つの重要な視点に焦点を当てている。 1)AIアプリの人気を分析し、AIアプリの更新状況を調査するアプリケーション分析。 2)AIフレームワークの使用状況とAIモデル保護を分析するフレームワークとモデル分析。 3)ユーザプライバシ保護とユーザレビューの態度を検討するユーザ分析を行った。私たちの研究は、AIアプリ開発者、ユーザ、AI R&Dに強く影響しています。ひとつは、モバイルアプリケーションにおけるAI統合の増加傾向に注目し、さまざまなAIフレームワークやモデルが広く採用されていることを示しています。一方,アプリセキュリティを強化するために,堅牢なモデル保護の必要性が指摘されている。さらに、ユーザプライバシの重要性を強調し、現在のAIアプリで使用されているAIテクノロジに対するユーザの態度を示す。私たちは、モバイルアプリケーションで使用されるAIテクノロジに関する将来の研究のためのオープンソースリソースとして、AIアプリデータセット(現在、最も広範なAIアプリデータセット)を提供しています。 The integration of artificial intelligence (AI) into mobile applications has significantly transformed various domains, enhancing user experiences and providing personalized services through advanced machine learning (ML) and deep learning (DL) technologies. AI-driven mobile apps typically refer to applications that leverage ML/DL technologies to perform key tasks such as image recognition and natural language processing. In this paper, we conducted the most extensive empirical study on AI applications, exploring on-device ML apps, on-device DL apps, and AI service-supported (cloud-based) apps. 
Our study encompasses 56,682 real-world AI applications, focusing on three crucial perspectives: 1) Application analysis, where we analyze the popularity of AI apps and investigate the update states of AI apps; 2) Framework and model analysis, where we analyze AI framework usage and AI model protection; 3) User analysis, where we examine user privacy protection and user review attitudes. Our study has strong implications for AI app developers, users, and AI R&D. On one hand, our findings highlight the growing trend of AI integration in mobile applications, demonstrating the widespread adoption of various AI frameworks and models. On the other hand, our findings emphasize the need for robust model protection to enhance app security. Additionally, our study highlights the importance of user privacy and presents user attitudes towards the AI technologies utilized in current AI apps. We provide our AI app dataset (currently the most extensive AI app dataset) as an open-source resource for future research on AI technologies utilized in mobile applications.	公開日:2024-09-27 翻訳日:2024-11-09 15:35:37
# CURO:相対的オーバージェネレーションのためのカリキュラム学習 CURO: Curriculum Learning for Relative Overgeneralization ( http://arxiv.org/abs/2212.02733v3 ) ライセンス: Link先を確認	Lin Shi, Qiyuan Liu, Bei Peng,	(参考訳) 相対的過一般化(英: Relative Over generalization, RO)は、最適関節作用の効用が準最適関節作用の効用より下降した場合に、協調的マルチエージェントタスクで生じる病理である。 ROは、エージェントを局所的な最適状態に陥れさせるか、あるいは特定の時間内にエージェント間の重要な調整を必要とする協調的なタスクを解くのに失敗する。本研究では、マルチエージェント強化学習(MARL)において、値ベースアルゴリズムとポリシー勾配アルゴリズムの両方がROに悩まされ、効果的なコーディネーションポリシーを学習できないことを実証的に見出した。 ROを克服するために,相対的オーバージェネリゼーション(CURO)のためのカリキュラム学習という新しい手法を提案する。強力なROを示すターゲットタスクを解決するため,CUROではまず目標タスクの報酬関数を微調整し,エージェントを訓練するためのソースタスクを生成する。そこで我々は,あるタスクにおいて得られた知識を効率よく次のタスクに転送するために,値関数転送とバッファ転送を組み合わせた伝達学習手法を用いて,目的タスクのより効率的な探索を可能にする。 CUROは一般的に、値ベースおよびポリシー勾配MARL法の両方に適用できる。 QMIX, HAPPO, HATRPOに適用した場合, CUROは重大ROを克服し, 性能を向上し, 多様な協調型マルチエージェントタスクにおいて, ベースライン法より優れていることを示す。 Relative overgeneralization (RO) is a pathology that can arise in cooperative multi-agent tasks when the optimal joint action's utility falls below that of a sub-optimal joint action. RO can cause the agents to get stuck into local optima or fail to solve cooperative tasks requiring significant coordination between agents within a given timestep. In this work, we empirically find that, in multi-agent reinforcement learning (MARL), both value-based and policy gradient MARL algorithms can suffer from RO and fail to learn effective coordination policies. To better overcome RO, we propose a novel approach called curriculum learning for relative overgeneralization (CURO). To solve a target task that exhibits strong RO, in CURO, we first fine-tune the reward function of the target task to generate source tasks to train the agent. Then, to effectively transfer the knowledge acquired in one task to the next, we use a transfer learning method that combines value function transfer with buffer transfer, which enables more efficient exploration in the target task. CURO is general and can be applied to both value-based and policy gradient MARL methods. 
We demonstrate that, when applied to QMIX, HAPPO, and HATRPO, CURO can successfully overcome severe RO, achieve improved performance, and outperform baseline methods in a variety of challenging cooperative multi-agent tasks.	公開日:2024-09-23 翻訳日:2024-11-09 15:35:37
# テンソル分解によるグラフニューラルネットワークの効率的な関係認識近傍集約 Efficient Relation-aware Neighborhood Aggregation in Graph Neural Networks via Tensor Decomposition ( http://arxiv.org/abs/2212.05581v4 ) ライセンス: Link先を確認	Peyman Baghershahi, Reshad Hosseini, Hadi Moradi,	(参考訳) 知識グラフ埋め込み(KGE)の課題に取り組むために,多数のグラフニューラルネットワーク(GNN)が開発された。しかし、これらのアプローチの多くは、関係情報の重要な役割を見落とし、エンティティ情報と不十分に統合し、表現力は低下する。本稿では,リレーショナルグラフ畳み込みネットワーク(R-GCN)の集約関数にテンソル分解を組み込んだ新しい知識グラフエンコーダを提案する。我々のモデルは、関係型によって定義される低ランクテンソルの射影行列を用いて、隣り合う実体の表現を強化する。このアプローチはマルチタスク学習を容易にし、関係認識表現を生成する。さらに、CP分解によるコアテンソルの低ランク推定手法を導入し、モデルを効果的に圧縮・正規化する。コントラスト学習にインスパイアされたトレーニング戦略を採用し,グラフ処理に固有の1-N法のトレーニング制限を緩和する。私たちはFB15k-237とWN18RRという2つの一般的なベンチマークデータセットにおいて、エンティティとリレーションのために低次元の埋め込みを使用しながら、競合のすべてを上回っました。 Numerous Graph Neural Networks (GNNs) have been developed to tackle the challenge of Knowledge Graph Embedding (KGE). However, many of these approaches overlook the crucial role of relation information and inadequately integrate it with entity information, resulting in diminished expressive power. In this paper, we propose a novel knowledge graph encoder that incorporates tensor decomposition within the aggregation function of Relational Graph Convolutional Network (R-GCN). Our model enhances the representation of neighboring entities by employing projection matrices of a low-rank tensor defined by relation types. This approach facilitates multi-task learning, thereby generating relation-aware representations. Furthermore, we introduce a low-rank estimation technique for the core tensor through CP decomposition, which effectively compresses and regularizes our model. We adopt a training strategy inspired by contrastive learning, which relieves the training limitation of the 1-N method inherent in handling vast graphs. We outperformed all our competitors on two common benchmark datasets, FB15k-237 and WN18RR, while using low-dimensional embeddings for entities and relations.	公開日:2024-09-21 翻訳日:2024-11-09 15:35:37
# Z-SSMNet : Bi-parametric MRIによる前立腺癌検出と診断のためのゾーナル・アウェア自己監督メッシュネットワーク Z-SSMNet: Zonal-aware Self-supervised Mesh Network for Prostate Cancer Detection and Diagnosis with Bi-parametric MRI ( http://arxiv.org/abs/2212.05808v2 ) ライセンス: Link先を確認	Yuan Yuan, Euijoon Ahn, Dagan Feng, Mohamad Khadra, Jinman Kim,	(参考訳) 臨床的に有意な前立腺癌(csPCa)の検出と診断において,bi-parametric magnetic resonance imaging (bpMRI)が重要なモダリティとなっている。 bpMRIを用いてcsPCaを識別するAIベースのシステムを開発することで、効率性とコスト効率を向上させることにより、PCa管理を変革することができる。しかし、畳み込みニューラルネットワーク(CNN)を用いた現在の最先端手法は、異方性画像から平面内および三次元空間情報を学習する際に限られている。それらのパフォーマンスは、大きく、多様で、よく注釈付けされたbpMRIデータセットの可用性にも依存する。本研究では,多次元(2D/2.5D/3D)畳み込みを適応的に統合し,高密度なスライス情報と異方性bpMRIのスライス間情報をバランスよく学習するZ-SSMNetを提案する。 bpMRIの外観,テクスチャ,構造を学習するために,大規模未ラベルデータを用いてネットワークを事前学習するための自己教師付き学習(SSL)手法を提案する。トレーニング前の段階で、スライス内情報とスライス間情報の両方をキャプチャすることを目的としている。さらに,我々は,csPCaの検出・診断能力をさらに向上するため,粒子解剖学的領域に集中するようにネットワークを拘束した。 10000以上のマルチセンターデータとマルチスキャナデータからなるPI-CAIデータセットについて広範な実験を行った。 Z-SSMNetは病変レベルの診断(APスコア0.633)と患者レベルの診断(AUROCスコア0.881)の両方に優れ,PI-CAIチャレンジのオープン開発フェーズにおけるトップ位置を確保し,APスコア0.690とAUROCスコア0.909を達成し,クローズドテストフェーズにおける第2位の地位を確保した。 Bi-parametric magnetic resonance imaging (bpMRI) has become a pivotal modality in the detection and diagnosis of clinically significant prostate cancer (csPCa). Developing AI-based systems to identify csPCa using bpMRI can transform PCa management by improving efficiency and cost-effectiveness. However, current state-of-the-art methods using convolutional neural networks (CNNs) are limited in learning in-plane and three-dimensional spatial information from anisotropic images. Their performances also depend on the availability of large, diverse, and well-annotated bpMRI datasets. We propose a Zonal-aware Self-supervised Mesh Network (Z-SSMNet) that adaptively integrates multi-dimensional (2D/2.5D/3D) convolutions to learn dense intra-slice information and sparse inter-slice information of the anisotropic bpMRI in a balanced manner. 
A self-supervised learning (SSL) technique is proposed to pre-train our network using large-scale unlabeled data to learn the appearance, texture, and structure semantics of bpMRI. It aims to capture both intra-slice and inter-slice information during the pre-training stage. Furthermore, we constrained our network to focus on the zonal anatomical regions to further improve the detection and diagnosis capability of csPCa. We conducted extensive experiments on the PI-CAI dataset comprising 10000+ multi-center and multi-scanner data. Our Z-SSMNet excelled in both lesion-level detection (AP score of 0.633) and patient-level diagnosis (AUROC score of 0.881), securing the top position in the Open Development Phase of the PI-CAI challenge and maintained strong performance, achieving an AP score of 0.690 and an AUROC score of 0.909, and securing the second-place ranking in the Closed Testing Phase.	公開日:2024-09-22 翻訳日:2024-11-09 15:35:37
# 大規模言語モデルにおけるグラフ学習とその発展 Graph Learning and Its Advancements on Large Language Models: A Holistic Survey ( http://arxiv.org/abs/2212.08966v5 ) ライセンス: Link先を確認	Shaopeng Wei, Jun Wang, Yu Zhao, Xingyan Chen, Qing Li, Fuzhen Zhuang, Ji Liu, Fuji Ren, Gang Kou,	(参考訳) グラフ学習は、ノード間の複雑な関係とグラフのトポロジ的構造を学習する試みである。長年にわたり、グラフ学習はグラフ理論からグラフデータマイニングへと移行してきた。表現学習の出現により、多様なシナリオにおいて顕著なパフォーマンスを達成した。幅広い応用の見通しから、グラフ学習には注意が集まっている。一部の研究者はグラフ学習に関する見事な調査を達成しているが、関連する目的や方法、アプリケーションをより一貫性のある方法で結びつけることに失敗した。その結果、グラフ学習の急速な拡大により、現在の十分なシナリオや課題は含まれなかった。特に、大規模言語モデルは近年、人間の生活に破壊的な影響を与えてきたが、構造化シナリオの相対的な弱点も示している。グラフ学習でこれらのモデルをいかに強力にするかという問題は、まだ未解決のままだ。我々の調査は、グラフ学習と事前訓練された言語モデルの統合における最新の進歩に焦点を当て、特に大規模言語モデルの領域におけるそれらの応用を強調した。グラフ学習に関するこれまでの調査とは違って、グラフ構造の観点から現在の研究を分析し、グラフ学習における最新のアプリケーション、トレンド、課題について論じる総合的なレビューを提供する。具体的には、分類学を提案し、それからグラフ学習の手法を要約する。次に、メインストリームアプリケーションの詳細な解明を提供します。最後に,今後の方向性を提案する。 Graph learning is a prevalent domain that endeavors to learn the intricate relationships among nodes and the topological structure of graphs. Over the years, graph learning has transcended from graph theory to graph data mining. With the advent of representation learning, it has attained remarkable performance in diverse scenarios. Owing to its extensive application prospects, graph learning attracts copious attention. While some researchers have accomplished impressive surveys on graph learning, they failed to connect related objectives, methods, and applications in a more coherent way. As a result, they did not encompass current ample scenarios and challenging problems due to the rapid expansion of graph learning. Particularly, large language models have recently had a disruptive effect on human life, but they also show relative weakness in structured scenarios. The question of how to make these models more powerful with graph learning remains open. 
Our survey focuses on the most recent advancements in integrating graph learning with pre-trained language models, specifically emphasizing their application within the domain of large language models. Different from previous surveys on graph learning, we provide a holistic review that analyzes current works from the perspective of graph structure, and discusses the latest applications, trends, and challenges in graph learning. Specifically, we commence by proposing a taxonomy and then summarize the methods employed in graph learning. We then provide a detailed elucidation of mainstream applications. Finally, we propose future directions.	公開日:2024-09-21 翻訳日:2024-11-09 15:35:37
# 重複「なし」の政策学習:悲観主義と一般化された経験的ベルンシュタイン不等式 Policy learning "without" overlap: Pessimism and generalized empirical Bernstein's inequality ( http://arxiv.org/abs/2212.09900v3 ) ライセンス: Link先を確認	Ying Jin, Zhimei Ren, Zhuoran Yang, Zhaoran Wang,	(参考訳) 本論文はオフライン政策学習を研究する。これは、事前に収集された観測データ(固定された、あるいは適応的に変化する行動方策から得られたもの)を活用して、与えられた集団に対して最良の総合的結果をもたらす最適な個別化決定ルールを学習することを目的とする。既存の政策学習法は一様重なりの仮定、すなわち、すべての個体特性に対してすべての行動を探索する傾向スコアが下に有界であるという仮定に依存している。データ収集プロセスを制御できないため、この仮定は多くの状況において非現実的になり得る。特に、行動方策が時間とともに変化し、特定の行動の傾向スコアが減少していく場合にそうである。本稿では、政策値の点推定の代わりに下側信頼限界(LCB)を最適化する新しいアルゴリズムである悲観的政策学習(PPL)を提案する。 LCBは、オフラインデータを収集した行動方策の知識を用いて構築される。一様重なり条件を仮定せずに、我々はアルゴリズムの準最適性に対するデータ依存の上界を確立する。この上界は (i) 最適方策に対する重なりと (ii) 最適化対象の方策クラスの複雑さのみに依存する。その帰結として、適応的に収集されたデータに対しては、最適行動の傾向スコアが時間を通じて下に有界である限り効率的な政策学習が保証され、準最適行動の傾向スコアは任意に速く減少してもよい。理論解析において、逆傾向重み付け推定量のための新しい自己正規化型の集中不等式を開発し、よく知られた経験的ベルンシュタイン不等式を非有界かつ非i.i.d.のデータに一般化する。我々は、Majorization-Minimizationと方策木探索による効率的な最適化アルゴリズム、およびPPLの有効性を示す広範なシミュレーション研究と実世界への応用によって、理論を補完する。 This paper studies offline policy learning, which aims at utilizing observations collected a priori (from either fixed or adaptively evolving behavior policies) to learn an optimal individualized decision rule that achieves the best overall outcomes for a given population. Existing policy learning methods rely on a uniform overlap assumption, i.e., the propensities of exploring all actions for all individual characteristics must be lower bounded. As one has no control over the data collection process, this assumption can be unrealistic in many situations, especially when the behavior policies are allowed to evolve over time with diminishing propensities for certain actions. In this paper, we propose Pessimistic Policy Learning (PPL), a new algorithm that optimizes lower confidence bounds (LCBs) -- instead of point estimates -- of the policy values. The LCBs are constructed using knowledge of the behavior policies for collecting the offline data. 
Without assuming any uniform overlap condition, we establish a data-dependent upper bound for the suboptimality of our algorithm, which only depends on (i) the overlap for the optimal policy, and (ii) the complexity of the policy class we optimize over. As an implication, for adaptively collected data, we ensure efficient policy learning as long as the propensities for optimal actions are lower bounded over time, while those for suboptimal ones are allowed to diminish arbitrarily fast. In our theoretical analysis, we develop a new self-normalized type concentration inequality for inverse-propensity-weighting estimators, generalizing the well-known empirical Bernstein's inequality to unbounded and non-i.i.d. data. We complement our theory with an efficient optimization algorithm via Majorization-Minimization and policy tree search, as well as extensive simulation studies and real-world applications that demonstrate the efficacy of PPL.	公開日:2024-09-26 翻訳日:2024-11-09 15:35:37
# 局所駆動量子磁石の実空間熱化 Real space thermalization of locally driven quantum magnets ( http://arxiv.org/abs/2212.13790v2 ) ライセンス: Link先を確認	Ronald Melendrez, Bhaskar Mukherjee, Prakash Sharma, Arijeet Pal, Hitesh J. Changlani,	(参考訳) 孤立系における熱化とその破れの研究は、非平衡量子状態とその初期条件への依存性の深い理解につながった。初期条件の役割は、量子多体傷跡の存在によって顕著に強調される。これは、基礎に有効なスーパースピン構造を持つ特別な非熱的状態であり、それ以外はカオス的な多体スペクトルに埋め込まれている。スピン・ハイゼンベルクモデルと$XXZ$モデル、およびその一次元・高次元の変種は、厳密な量子多体傷跡を持ち、合成系および凝縮系で実現可能なスピンヘリックス状態の完全な復活を示すことが示されている。これらの進歩に触発されて、空間熱化プロファイルを探索し、システムの異なる部分がどのように熱化し、スーパースピンの運命にどのように影響するかを明らかにするために、実験的にアクセス可能で、局所的、時間に依存したプロトコルを提案する。我々は、駆動スピンと他のスピンとの相互作用に基づいて、強磁性($X$偏極)初期状態の異なるパラメトリックな領域を特定する。また,スーパースピンが長時間の局所駆動に対して回復力を持つパラメータ領域も同定する。数値観測を説明する実空間およびフロケ空間の描像を構築し,様々な実験装置で検証可能な予測を行う。 The study of thermalization and its breakdown in isolated systems has led to a deeper understanding of non-equilibrium quantum states and their dependence on initial conditions. The role of initial conditions is prominently highlighted by the existence of quantum many-body scars, special athermal states with an underlying effective superspin structure, embedded in an otherwise chaotic many-body spectrum. Spin Heisenberg and $XXZ$ models and their variants in one and higher dimension have been shown to host exact quantum many-body scars, exhibiting perfect revivals of spin helix states that are realizable in synthetic and condensed matter systems. Motivated by these advances, we propose experimentally accessible, local, time-dependent protocols to explore the spatial thermalization profile and highlight how different parts of the system thermalize and affect the fate of the superspin. We identify distinct parametric regimes for the ferromagnetic ($X$-polarized) initial state based on the interplay between the driven spin and the rest, including local athermal behavior where the driven spin effectively decouples, acting like a "cold" spot while being instrumental in heating up the other spins. 
We also identify parameter regimes where the superspin remains resilient to local driving for long time scales. We develop a real and Floquet space picture that explains our numerical observations, and make predictions that can be tested in various experimental setups.	公開日:2024-09-26 翻訳日:2024-11-09 15:24:36
# インコンテクスト学習に関する調査研究 A Survey on In-context Learning ( http://arxiv.org/abs/2301.00234v5 ) ライセンス: Link先を確認	Qingxiu Dong, Lei Li, Damai Dai, Ce Zheng, Jingyuan Ma, Rui Li, Heming Xia, Jingjing Xu, Zhiyong Wu, Tianyu Liu, Baobao Chang, Xu Sun, Lei Li, Zhifang Sui,	(参考訳) 大規模言語モデル(LLM)の能力の増大に伴い、インコンテキスト学習(ICL)は自然言語処理(NLP)の新しいパラダイムとして登場し、LLMはいくつかの例で拡張されたコンテキストに基づいて予測を行う。 ICLを探索してLLMの能力を評価・外挿する重要な傾向である。本稿では,ICLの進歩と課題を概観し,整理することを目的とする。まず、ICLの形式的定義を示し、関連する研究との相関を明らかにする。そこで我々は,訓練戦略,迅速な設計戦略,関連する分析など,高度な手法を整理し,議論する。さらに、データエンジニアリングや知識更新など、さまざまなICLアプリケーションシナリオについても検討する。最後に、ICLの課題に対処し、さらなる研究の方向性を提案する。 ICLがどのように機能し、ICLを改善するかについて、私たちの研究がより深く研究されることを願っています。 With the increasing capabilities of large language models (LLMs), in-context learning (ICL) has emerged as a new paradigm for natural language processing (NLP), where LLMs make predictions based on contexts augmented with a few examples. It has been a significant trend to explore ICL to evaluate and extrapolate the ability of LLMs. In this paper, we aim to survey and summarize the progress and challenges of ICL. We first present a formal definition of ICL and clarify its correlation to related studies. Then, we organize and discuss advanced techniques, including training strategies, prompt designing strategies, and related analysis. Additionally, we explore various ICL application scenarios, such as data engineering and knowledge updating. Finally, we address the challenges of ICL and suggest potential directions for further research. We hope that our work can encourage more research on uncovering how ICL works and improving ICL.	公開日:2024-10-05 翻訳日:2024-11-09 15:24:36
# ハニーポットデータにおける教師なし攻撃パターン検出のためのネストディリクレモデル Nested Dirichlet models for unsupervised attack pattern detection in honeypot data ( http://arxiv.org/abs/2301.02505v3 ) ライセンス: Link先を確認	Francesco Sanna Passino, Anastasia Mantziou, Daniyar Ghani, Philip Thiede, Ross Bevington, Nicholas A. Heard,	(参考訳) サイバーシステムは侵入の試みによるほぼ絶え間ない脅威にさらされている。攻撃の種類は異なるが、それぞれの試みは典型的には特定の意図を持ち、加害者は典型的には同様の目的を持った個人のグループである。共通の意図を共有しているように見える攻撃をクラスタリングすることは、脅威追跡の専門家にとって非常に価値がある。本稿では、悪意のある攻撃者を誘惑するように設計された特別なネットワークホストであるハニーポットから収集した端末セッションコマンドをクラスタリングするためのディリクレ分布トピックモデルについて検討する。セッションをクラスタリングする主な実践的意味は2つある。類似した攻撃のグループを見つけることと、外れ値を特定することである。様々な統計モデルが検討され、コマンドライン構文の構造に適応している。特に、プライマリトピックとセカンダリトピックの概念、そしてセッションレベルおよびコマンドレベルトピックの概念が、解釈可能性を改善するためにモデルに導入される。提案手法はさらにベイズ的非パラメトリックな方法で拡張され、語彙サイズと潜在意図数の非有界性を許容する。これらの手法は、従来のトピックモデリングアプローチでは検出されていない、既存の暗号通貨のコインマイニングインフラを乗っ取ろうとする、珍しいMIRAI変異を発見している。 Cyber-systems are under near-constant threat from intrusion attempts. Attack types vary, but each attempt typically has a specific underlying intent, and the perpetrators are typically groups of individuals with similar objectives. Clustering attacks appearing to share a common intent is very valuable to threat-hunting experts. This article explores Dirichlet distribution topic models for clustering terminal session commands collected from honeypots, which are special network hosts designed to entice malicious attackers. The main practical implications of clustering the sessions are two-fold: finding similar groups of attacks, and identifying outliers. A range of statistical models are considered, adapted to the structures of command-line syntax. In particular, concepts of primary and secondary topics, and then session-level and command-level topics, are introduced into the models to improve interpretability. The proposed methods are further extended in a Bayesian nonparametric fashion to allow unboundedness in the vocabulary size and the number of latent intents. 
The methods are shown to discover an unusual MIRAI variant which attempts to take over existing cryptocurrency coin-mining infrastructure, not detected by traditional topic-modelling approaches.	公開日:2024-09-22 翻訳日:2024-11-09 15:24:36
# クラス増分学習における効果的な決定境界学習 Effective Decision Boundary Learning for Class Incremental Learning ( http://arxiv.org/abs/2301.05180v4 ) ライセンス: Link先を確認	Kunchi Li, Jun Wan, Shan Yu,	(参考訳) クラスインクリメンタルラーニング(CIL)におけるリハーサルアプローチは、知識蒸留のための古いクラスデータの不足と、記憶メモリが限られているために生じる学習済みクラスと新しいクラス間の不均衡なデータ学習という2つの要因によって、新しいクラスに過度に適合する決定境界に悩まされる。本研究では,これらの2つの要因に対処するための,単純かつ効果的なアプローチを提案する。まず、再サンプリング戦略とMixup Knowledge Distillation (Re-MKD)を用いてKDの性能を改善し、過剰適合問題を大幅に緩和する。具体的には、学習済みクラスと新しいクラス間の潜在分布とより整合した、KDトレーニングで使用される適切なデータを合成するために、ミックスアップと再サンプリングの戦略を組み合わせる。次に, インフルエンスバランス法をCIL設定に拡張することにより, 不均衡データの分類に対処するインクリメンタルインフルエンスバランス(IIB)法を提案する。この手法は、サンプルをその影響度によって再重み付けし、適切な決定境界を形成する。これら2つの改善により、KDの性能を改善し、不均衡なデータ学習を同時に扱う効果的な決定境界学習アルゴリズム(EDBL)を提案する。実験の結果、EDBLはいくつかのCILベンチマークで最先端のパフォーマンスを達成できた。 Rehearsal approaches in class incremental learning (CIL) suffer from decision boundary overfitting to new classes, which is mainly caused by two factors: insufficiency of old classes data for knowledge distillation and imbalanced data learning between the learned and new classes because of the limited storage memory. In this work, we present a simple but effective approach to tackle these two factors. First, we employ a re-sampling strategy and Mixup Knowledge Distillation (Re-MKD) to improve the performances of KD, which would greatly alleviate the overfitting problem. Specifically, we combine mixup and re-sampling strategies to synthesize adequate data used in KD training that are more consistent with the latent distribution between the learned and new classes. Second, we propose a novel incremental influence balance (IIB) method for CIL to tackle the classification of imbalanced data by extending the influence balance method into the CIL setting, which re-weights samples by their influences to create a proper decision boundary. With these two improvements, we present the effective decision boundary learning algorithm (EDBL) which improves the performance of KD and deals with the imbalanced data learning simultaneously.
Experiments show that the proposed EDBL achieves state-of-the-art performances on several CIL benchmarks.	公開日:2024-09-26 翻訳日:2024-11-09 15:24:36
# 表面マイニングにおける自動化とAI技術 -Pilbaraにおけるオープンピット操作の簡単な紹介- Automation and AI Technology in Surface Mining With a Brief Introduction to Open-Pit Operations in the Pilbara ( http://arxiv.org/abs/2301.09771v6 ) ライセンス: Link先を確認	Raymond Leung, Andrew J Hill, Arman Melkumyan,	(参考訳) 本稿では,鉱業,特に西オーストラリアのピルバラ鉄鉱地帯で発生した工学的問題,技術革新,ロボット開発,自動化の取り組みについて概説する。目標は、テクノロジの展望を描き、エンジニアリングのオーディエンスに関連する課題を強調して、AIに対する認識を高め、マイニングにおける自動化のトレンドを高めることだ。これは、読者が鉱業に関する事前の知識を持っていないと仮定し、共通の露天掘り鉱業に関する議論と短い要約を通じて、徐々に文脈を構築していく。主な活動は、資源開発、鉱業、鉄道、港湾業の分野に分類される。鉱物探査から鉱石の出荷まで、この間にはおよそ9つの段階がある。地質学的アセスメント、鉱山計画と開発、生産の掘削と調査、爆破と掘削、鉱石と廃棄物の輸送、解体とスクリーン、ストックパイルとロードアウト、鉄道網の流通、および鉱石車ダンピングなどである。目的は、これらのプロセスを説明し、10年にわたる産業大学と研究開発のパートナーシップの観点から、課題/機会のいくつかについて洞察を提供することである。 This survey article provides a synopsis on some of the engineering problems, technological innovations, robotic development and automation efforts encountered in the mining industry -- particularly in the Pilbara iron-ore region of Western Australia. The goal is to paint the technology landscape and highlight issues relevant to an engineering audience to raise awareness of AI and automation trends in mining. It assumes the reader has no prior knowledge of mining and builds context gradually through focused discussion and short summaries of common open-pit mining operations. The principal activities that take place may be categorized in terms of resource development, mine-, rail- and port operations. From mineral exploration to ore shipment, there are roughly nine steps in between. These include: geological assessment, mine planning and development, production drilling and assaying, blasting and excavation, transportation of ore and waste, crush and screen, stockpile and load-out, rail network distribution, and ore-car dumping. The objective is to describe these processes and provide insights on some of the challenges/opportunities from the perspective of a decade-long industry-university R&D partnership.	公開日:2024-09-27 翻訳日:2024-11-09 15:24:36
# 単軌道分布ロバスト強化学習 Single-Trajectory Distributionally Robust Reinforcement Learning ( http://arxiv.org/abs/2301.11721v2 ) ライセンス: Link先を確認	Zhipeng Liang, Xiaoteng Ma, Jose Blanchet, Jiheng Zhang, Zhengyuan Zhou,	(参考訳) 古典的強化学習(RL)フレームワークが同一のトレーニング環境とテスト環境に大きく依存する限界を軽減するため、分散ロバストRL(DRRL)は、おそらく未知のテスト環境を含む様々な環境のパフォーマンスを高めるために提案されている。ロバスト性ゲインの価格として、DRRLは一連の分布を最適化するが、これは本質的に非ロバストな場合の固定分布を最適化するよりも難しい。既存のDRRLアルゴリズムはモデルベースか、1つのサンプル軌道から学習できないかのいずれかである。本稿では,分散ロバストなQ-ラーニング(DRQ)と呼ばれる,完全モデルフリーなDRRLアルゴリズムを設計する。本研究では,各サンプルを段階的に活用するマルチタイム・フレームワークを微妙に設計し,環境をモデル化せずに最適な分散ロバストなポリシーを直接学習する。アルゴリズムの複雑さにもかかわらず、古典確率近似ツールを一般化することにより漸近収束を保証する。総合的な実験結果から,提案アルゴリズムの頑健性やサンプルの複雑さは,非ロバストな手法や他のロバストなRLアルゴリズムと比較して優れていることが示された。 To mitigate the limitation that the classical reinforcement learning (RL) framework heavily relies on identical training and test environments, Distributionally Robust RL (DRRL) has been proposed to enhance performance across a range of environments, possibly including unknown test environments. As a price for robustness gain, DRRL involves optimizing over a set of distributions, which is inherently more challenging than optimizing over a fixed distribution in the non-robust case. Existing DRRL algorithms are either model-based or fail to learn from a single sample trajectory. In this paper, we design a first fully model-free DRRL algorithm, called distributionally robust Q-learning with single trajectory (DRQ). We delicately design a multi-timescale framework to fully utilize each incrementally arriving sample and directly learn the optimal distributionally robust policy without modelling the environment, thus the algorithm can be trained along a single trajectory in a model-free fashion. Despite the algorithm's complexity, we provide asymptotic convergence guarantees by generalizing classical stochastic approximation tools. 
Comprehensive experimental results demonstrate the superior robustness and sample complexity of our proposed algorithm, compared to non-robust methods and other robust RL algorithms.	公開日:2024-09-21 翻訳日:2024-11-09 15:24:36
# 位相遷移を厳密に探究する普遍記号を定義する Defining a universal sign to strictly probe a phase transition ( http://arxiv.org/abs/2301.12438v4 ) ライセンス: Link先を確認	Nvsen Ma, Jun-Song Sun, Gaopei Pan, Chen Cheng, Zheng Yan,	(参考訳) 量子モンテカルロシミュレーションにおける悪名高い符号問題の謎は、フェルミオン系およびフラストレーション系における手法の適用を効果的に制限している。最近の研究 (Science 375, 418 (2022)) では, 相転移の探索に符号を使用できることを指摘し, 符号問題において顕著なブレークスルーをおこなった。本研究では,符号問題と位相遷移が常に厳密に関連付けられないことを示すために,原点と参照系の間の自由エネルギーの差に関連する符号の定義に基づく一般論を提案した。符号は、基準系の自由エネルギーが変数パラメータの下で平坦である場合にのみ、位相遷移を正確にプローブすることができるが、設計はほぼ不可能である。一般に、記号が位相遷移を探索できるという結論は、普遍性のない生存バイアスである。この問題を解決するために,参照システムの影響を排除し,位相遷移を厳密に探索する修正符号を定義する。この研究は、新しい修飾符号によって相転移を検出する不偏解を与える。 The mystery of the infamous sign problem in quantum Monte Carlo simulations mightily restricts applications of the method in fermionic and frustrated systems. A recent work [Science 375, 418 (2022)] made a remarkable breakthrough in the sign problem by pointing out that the sign can be used to probe phase transition. In this work, we proposed a general argument based on the definition of the sign that is related to the difference in free energy between the original and reference systems to clarify that the sign problem and phase transition cannot always be strictly related. The sign can exactly probe phase transition only if the free energy in the reference system is flat under variable parameters, which is almost impossible to design. Generally speaking, the conclusion that the sign can probe phase transition is survivorship bias without universality. To solve this problem, we define a modified sign that excludes the influence of the reference system, which can probe the phase transition strictly. The work gives an unbiased solution for detecting phase transition by the new modified sign.	公開日:2024-09-26 翻訳日:2024-11-09 15:24:36
# W2SAT: 軽量リテラルインシデンスグラフからSATインスタンスを生成する学習 W2SAT: Learning to generate SAT instances from Weighted Literal Incidence Graphs ( http://arxiv.org/abs/2302.00272v2 ) ライセンス: Link先を確認	Weihuang Wen, Tianshu Yu,	(参考訳) ブール満足度(SAT)問題は理論計算機科学において魅力的なNP完全問題であり、幅広いコンピューティング関連アプリケーションにおいて中心的な役割を果たす。多くのシナリオ下でSATソルバの爆発とチューニングを行うには、非常に高品質なSATインスタンスが必要である。そこで本論文では,実世界の実物/産業のインスタンスから本質的な構造と特性を暗黙的に学習し,SAT式を生成するフレームワークであるW2SATを提案する。この目的のために我々は,既存の表現能力と一般化性を示す新たなSAT表現であるWeighted Literal Incidence Graph (WLIG)を導入し,特殊学習に基づくグラフ生成モデルを用いて効率的に生成することができる。 WLIGからSAT問題へのデコーディングは、新しい丘登り最適化手法であるOWC(Optimal Weight Coverage)で重なり合う斜めの発見としてモデル化される。実験では,従来の手法と比較して,グラフメトリクス,効率,拡張性の観点からWLIGによるアプローチの優位性を示す。さらに、実世界のアプリケーションにおけるグラフベースのSAT生成の限界、特にSATソルバパラメータチューニングのために生成されたインスタンスを利用する場合について論じ、潜在的な方向を示す。 The Boolean Satisfiability (SAT) problem stands out as an attractive NP-complete problem in theoretic computer science and plays a central role in a broad spectrum of computing-related applications. Exploiting and tuning SAT solvers under numerous scenarios require massive high-quality industry-level SAT instances, which unfortunately are quite limited in the real world. To address the data insufficiency issue, in this paper, we propose W2SAT, a framework to generate SAT formulas by learning intrinsic structures and properties from given real-world/industrial instances in an implicit fashion. To this end, we introduce a novel SAT representation called Weighted Literal Incidence Graph (WLIG), which exhibits strong representation ability and generalizability against existing counterparts, and can be efficiently generated via a specialized learning-based graph generative model. Decoding from WLIGs into SAT problems is then modeled as finding overlapping cliques with a novel hill-climbing optimization method termed Optimal Weight Coverage (OWC). Experiments demonstrate the superiority of our WLIG-induced approach in terms of graph metrics, efficiency, and scalability in comparison to previous methods. 
Additionally, we discuss the limitations of graph-based SAT generation for real-world applications, especially when utilizing generated instances for SAT solver parameter-tuning, and pose some potential directions.	公開日:2024-09-24 翻訳日:2024-11-09 15:24:36
# Wasserstein距離におけるロバスト推定 Robust Estimation under the Wasserstein Distance ( http://arxiv.org/abs/2302.01237v2 ) ライセンス: Link先を確認	Sloan Nietert, Rachel Cummings, Ziv Goldfeld,	(参考訳) 本稿では、最適輸送(OT)理論に根ざした確率分布間の一般的な相違尺度であるワッサーシュタイン距離の下でのロバスト分布推定の問題について検討する。未知分布$\mu$からの$n$個のサンプルのうち$\varepsilon n$個が敵対的に破損しているとき、最小のワッサーシュタイン誤差を持つ$\mu$の推定値を求める。この課題に対処するために, OT とロバスト統計学の2つのフレームワーク, 部分 OT (POT) と最小距離推定 (MDE) を利用する。我々はPOTの新たな構造特性を証明し、それを用いて、部分的なワッサーシュタイン距離の下でのMDEが、多くの設定においてミニマックス最適なロバスト推定リスクを達成することを示す。その過程で、標準的なOTに対する古典的カントロビッチ双対に上限ノルム(sup-norm)のペナルティを加えるPOTの新しい双対形式を導出する。一般的なWGAN(Wasserstein Generative Adversarial Network)フレームワークは,カントロビッチ双対性を介してWasserstein MDEを実装しているため,我々のペナルティ付き双対は,WGANに初等的な修正を加えるだけで,汚染データセットを用いた大規模生成モデリングを可能にする。敵対的破損の影響を緩和する本手法の有効性を実証する数値実験を行った。 We study the problem of robust distribution estimation under the Wasserstein distance, a popular discrepancy measure between probability distributions rooted in optimal transport (OT) theory. Given $n$ samples from an unknown distribution $\mu$, of which $\varepsilon n$ are adversarially corrupted, we seek an estimate for $\mu$ with minimal Wasserstein error. To address this task, we draw upon two frameworks from OT and robust statistics: partial OT (POT) and minimum distance estimation (MDE). We prove new structural properties for POT and use them to show that MDE under a partial Wasserstein distance achieves the minimax-optimal robust estimation risk in many settings. Along the way, we derive a novel dual form for POT that adds a sup-norm penalty to the classic Kantorovich dual for standard OT. Since the popular Wasserstein generative adversarial network (WGAN) framework implements Wasserstein MDE via Kantorovich duality, our penalized dual enables large-scale generative modeling with contaminated datasets via an elementary modification to WGAN. Numerical experiments demonstrating the efficacy of our approach in mitigating the impact of adversarial corruptions are provided.	公開日:2024-09-24 翻訳日:2024-11-09 15:24:36
# 適応的データ分析のためのサブサンプリング手法 Subsampling Suffices for Adaptive Data Analysis ( http://arxiv.org/abs/2302.08661v3 ) ライセンス: Link先を確認	Guy Blanc,	(参考訳) データセットで行った分析が全人口を代表することを保証することは、統計学における中心的な問題の一つである。ほとんどの古典的なテクニックは、データセットがアナリストのクエリとは独立していると仮定しており、データセットが複数の適応的に選択されたクエリのために再利用される一般的な設定では破綻する。この適応的データ分析(adaptive data analysis)の問題は、Dwork et al. (STOC, 2015) と Hardt and Ullman (FOCS, 2014) の先駆的な研究で定式化された。我々は、クエリが適応的に選択されたとしても代表性を保ち続ける、非常に単純な仮定のセットを特定する。必要な条件は、各クエリがランダムなサブサンプルを入力とし、少数のビットを出力することだけである。この結果は,サブサンプリングに固有のノイズが,クエリ応答の一般化を保証するのに十分であることを示している。このサブサンプルベースのフレームワークの単純さにより、以前の作業でカバーされていないさまざまな現実世界のシナリオをモデル化することができる。その単純さに加えて、統計的クエリと中央値探索という2つの基本的なタスクのメカニズムを設計することで、このフレームワークの有用性を実証する。特に、広く適用可能な統計的クエリのクラスに答える我々のメカニズムは、非常に単純であり、かつ多くのパラメータ領域において最先端である。 Ensuring that analyses performed on a dataset are representative of the entire population is one of the central problems in statistics. Most classical techniques assume that the dataset is independent of the analyst's query and break down in the common setting where a dataset is reused for multiple, adaptively chosen, queries. This problem of adaptive data analysis was formalized in the seminal works of Dwork et al. (STOC, 2015) and Hardt and Ullman (FOCS, 2014). We identify a remarkably simple set of assumptions under which the queries will continue to be representative even when chosen adaptively: The only requirements are that each query takes as input a random subsample and outputs few bits. This result shows that the noise inherent in subsampling is sufficient to guarantee that query responses generalize. The simplicity of this subsampling-based framework allows it to model a variety of real-world scenarios not covered by prior work. In addition to its simplicity, we demonstrate the utility of this framework by designing mechanisms for two foundational tasks, statistical queries and median finding. In particular, our mechanism for answering the broadly applicable class of statistical queries is both extremely simple and state of the art in many parameter regimes.	公開日:2024-09-24 翻訳日:2024-11-09 15:24:36
# 視覚変換器の効率的な知識蒸留におけるマスキングの役割 The Role of Masking for Efficient Supervised Knowledge Distillation of Vision Transformers ( http://arxiv.org/abs/2302.10494v4 ) ライセンス: Link先を確認	Seungwoo Son, Jegwang Ryu, Namhoon Lee, Jaeho Lee,	(参考訳) 知識蒸留は、軽量視覚モデルの訓練に有効な方法である。しかし、特に視覚変換器(ViT)のような大規模モデルでは、トレーニングサンプルの教師監督を取得するのにコストがかかることが多い。本稿では,ViT蒸留の監督コストを削減するための簡易な枠組みを開発し,教師に与えられた少量の入力トークンを隠蔽する。入力トークンをマスキングすることで、教師のパラメータやアーキテクチャを変更することなく、マスクされたトークンに関連する計算をスキップすることができる。学生の注意点が最も低いマスキングパッチは、学生の精度を低下させることなく、教師のFLOPの最大50%を節約し、他のマスキング基準は、最適以下の効率向上をもたらす。より詳細な分析により,学生が指導するマスキングが学生に良いカリキュラムを提供することが明らかとなり,教師の指導が早い段階で容易に受けられるようになり,後半の課題も解決できた。 Knowledge distillation is an effective method for training lightweight vision models. However, acquiring teacher supervision for training samples is often costly, especially from large-scale models like vision transformers (ViTs). In this paper, we develop a simple framework to reduce the supervision cost of ViT distillation: masking out a fraction of input tokens given to the teacher. By masking input tokens, one can skip the computations associated with the masked tokens without requiring any change to teacher parameters or architecture. We find that masking patches with the lowest student attention scores is highly effective, saving up to 50% of teacher FLOPs without any drop in student accuracy, while other masking criterion leads to suboptimal efficiency gains. Through in-depth analyses, we reveal that the student-guided masking provides a good curriculum to the student, making teacher supervision easier to follow during the early stage and challenging in the later stage.	公開日:2024-09-27 翻訳日:2024-11-09 15:24:36
# ベイズ行列分解とその応用 Bayesian Matrix Decomposition and Applications ( http://arxiv.org/abs/2302.11337v3 ) ライセンス: Link先を確認	Jun Lu,	(参考訳) 本書の唯一の目的は、行列分解技法をシームレスに導入するために、ベイズ行列分解における概念と数学的ツールを自己完結的に導入することである。しかし、ベイズ行列の分解に関する有用かつ興味深い結果をすべてカバーできないことは明らかであり、最適化を行うための変分推論の分離解析を例に挙げる。ベイズ解析の分野における文献を参照し、関連する分野についてより詳細な解説を行う。この本は、主に目的、重要なベイズ行列分解法、例えば実数値分解、非負行列分解、ベイズ補間分解、およびそれらの応用に光を当てた方法の起源と複雑さの要約である。数学の前提条件は統計学と線型代数の最初のコースである。この控えめな背景以外は、開発は自己完結しており、厳密な証明が提供される。 The sole aim of this book is to give a self-contained introduction to concepts and mathematical tools in Bayesian matrix decomposition in order to seamlessly introduce matrix decomposition techniques and their applications in subsequent sections. However, we clearly realize our inability to cover all the useful and interesting results concerning Bayesian matrix decomposition and given the paucity of scope to present this discussion, e.g., the separated analysis of variational inference for conducting the optimization. We refer the reader to literature in the field of Bayesian analysis for a more detailed introduction to the related fields. This book is primarily a summary of purpose, significance of important Bayesian matrix decomposition methods, e.g., real-valued decomposition, nonnegative matrix factorization, Bayesian interpolative decomposition, and the origin and complexity of the methods which shed light on their applications. The mathematical prerequisite is a first course in statistics and linear algebra. Other than this modest background, the development is self-contained, with rigorous proof provided throughout.	公開日:2024-09-26 翻訳日:2024-11-09 15:24:36
# 相互作用する2つのコールド極性分子の回転特性:線形、対称、非対称トップ Rotational properties of two interacting cold polar molecules: linear, symmetric, and asymmetric tops ( http://arxiv.org/abs/2303.02199v2 ) ライセンス: Link先を確認	Felipe Isaule, Robert Bennett, Jörg B. Götte,	(参考訳) 我々は、外部dc電場と異方性双極子-双極子相互作用の影響下で、2つの静極分子のポテンシャル-エネルギー曲線と双極子モーメントの偏極について検討した。分子を量子剛性ローターとしてモデル化し、その自由度を考慮し、線形、対称、非対称のトップ分子の選択を考える。電界の分子間分離と方向の異なる双極子のエネルギー曲線と偏極の総合的な検討を行い、分子の性質が短距離分離において磁場の方向に強く依存していることを見出した。後者は、分子双極子気体の自転自由度を説明できる可能性についての洞察を与える。 We examine the potential-energy curves and polarization of the dipole moments of two static polar molecules under the influence of an external dc electric field and their anisotropic dipole-dipole interaction. We model the molecules as quantum rigid rotors to take their rotational degrees of freedom into account and consider a selection of linear, symmetric, and asymmetric top molecules. We provide a comprehensive examination of the energy curves and polarization of the dipoles for varying inter-molecular separation and direction of the electric field and find that the properties of the molecules depend strongly on the field's direction at short separations, showing the importance of accounting for molecular rotation. The latter provides insight into the possible effects of accounting for rotational degrees of freedom in molecular dipolar gases.	公開日:2024-09-23 翻訳日:2024-11-09 15:24:36
# 分割共形予測における経験的カバレッジの普遍的分布 Universal distribution of the empirical coverage in split conformal prediction ( http://arxiv.org/abs/2303.02770v2 ) ライセンス: Link先を確認	Paulo C. Marques F,	(参考訳) スプリット共形予測が交換可能なデータでバッチモードで動作する場合、将来の観測可能量の有限バッチに対して生成された予測セットの実験的カバレッジの正確な分布と、バッチサイズが無限大になるときにそのほぼ確実な限界の正確な分布を決定する。どちらの分布も普遍的であり、名前付きミスカバーレベルとキャリブレーションサンプルサイズのみによって決定されるため、アプリケーションで必要最小限のキャリブレーションサンプルサイズを選択するための基準が確立される。 When split conformal prediction operates in batch mode with exchangeable data, we determine the exact distribution of the empirical coverage of prediction sets produced for a finite batch of future observables, as well as the exact distribution of its almost sure limit when the batch size goes to infinity. Both distributions are universal, being determined solely by the nominal miscoverage level and the calibration sample size, thereby establishing a criterion for choosing the minimum required calibration sample size in applications.	公開日:2024-09-21 翻訳日:2024-11-09 15:24:36
# 認識論的および偶然的不確実性のモデル化のための確率的統一関係--定理証明による意味論と自動推論 Probabilistic unifying relations for modelling epistemic and aleatoric uncertainty: semantics and automated reasoning with theorem proving ( http://arxiv.org/abs/2303.09692v3 ) ライセンス: Link先を確認	Kangfeng Ye, Jim Woodcock, Simon Foster,	(参考訳) 確率的プログラミングは、一般的なコンピュータプログラミング、統計的推論、形式意味論を組み合わせて、不確実性に直面した時にシステムが決定を下すのを助ける。確率的プログラムはユビキタスであり、マシンインテリジェンスに大きな影響を与えている。多くの確率的アルゴリズムは、実際には異なる領域で使われているが、形式的意味論に基づく自動検証は、まだ比較的新しい研究分野である。過去20年間、多くの関心を集めてきた。しかし、多くの課題が残っている。本稿で提示する確率的統一関係(ProbURel)は、これらの課題に取り組むという我々のビジョンに向けた一歩である。私たちの仕事は、Hehner氏の述語的確率プログラミングに基づいていますが、彼の仕事が広く採用されるにはいくつかの障害があります。ここでのコントリビューションは,(1)関係と算術を区別するためのIverson bracket表記の導入による構文と意味論の形式化,(2)Unifying Theories of Programming(UTP)を用いた関係の形式化と,実数の位相空間上の和を用いたブラケット外の確率の形式化,(3)Kleeneの不動点定理を用いた確率ループの構成的意味論,(4)構成的意味論を扱うための,分布から部分分布および超分布への意味論の拡張,(5)確率ループの推論を単純化するための一意不動点定理,(6)定理証明による自動推論のための,Isabelle/HOLにおけるUTPの実装であるIsabelle/UTPでの理論の機械化である。ロボットのローカライゼーションの問題,機械学習の分類,確率ループの終了など,6つの事例で研究成果を実演する。 Probabilistic programming combines general computer programming, statistical inference, and formal semantics to help systems make decisions when facing uncertainty. Probabilistic programs are ubiquitous, including having a significant impact on machine intelligence. While many probabilistic algorithms have been used in practice in different domains, their automated verification based on formal semantics is still a relatively new research area. In the last two decades, it has attracted much interest. Many challenges, however, remain. The work presented in this paper, probabilistic unifying relations (ProbURel), takes a step towards our vision to tackle these challenges. Our work is based on Hehner's predicative probabilistic programming, but there are several obstacles to the broader adoption of his work.
Our contributions here include (1) the formalisation of its syntax and semantics by introducing an Iverson bracket notation to separate relations from arithmetic; (2) the formalisation of relations using Unifying Theories of Programming (UTP) and probabilities outside the brackets using summation over the topological space of the real numbers; (3) the constructive semantics for probabilistic loops using Kleene's fixed-point theorem; (4) the enrichment of its semantics from distributions to subdistributions and superdistributions to deal with the constructive semantics; (5) the unique fixed-point theorem to simplify the reasoning about probabilistic loops; and (6) the mechanisation of our theory in Isabelle/UTP, an implementation of UTP in Isabelle/HOL, for automated reasoning using theorem proving. We demonstrate our work with six examples, including problems in robot localisation, classification in machine learning, and the termination of probabilistic loops.	公開日:2024-09-26 翻訳日:2024-11-09 15:24:36
# 画像付きマルチモーダルシャノンゲーム Multimodal Shannon Game with Images ( http://arxiv.org/abs/2303.11192v2 ) ライセンス: Link先を確認	Vilém Zouhar, Sunit Bhattacharya, Ondřej Bojar,	(参考訳) シャノンゲームは長年、言語学やNLPにおける思考実験として使われており、参加者に、前の文脈に基づいて次の文字を推測するよう求めてきた。画像情報の形式でオプションの余分なモダリティを導入することで、ゲームを拡張します。本ゲームにおけるマルチモーダル情報の影響を調べるため,人間と言語モデル(LM, GPT-2)を用いた。画像情報の追加により、人間とLMの両方の自己報告された信頼度と精度が向上することを示す。名詞や決定子などの一部の単語クラスは、追加のモダリティ情報から恩恵を受ける。ヒトとLMの双方のプライミング効果は、文脈サイズが増加するにつれてより明らかになる。これらの知見は、言語理解とモデリングを改善するためのマルチモーダル情報の可能性を強調している。 The Shannon game has long been used as a thought experiment in linguistics and NLP, asking participants to guess the next letter in a sentence based on its preceding context. We extend the game by introducing an optional extra modality in the form of image information. To investigate the impact of multimodal information in this game, we use human participants and a language model (LM, GPT-2). We show that the addition of image information improves both self-reported confidence and accuracy for both humans and LM. Certain word classes, such as nouns and determiners, benefit more from the additional modality information. The priming effect in both humans and the LM becomes more apparent as the context size (extra modality information + sentence context) increases. These findings highlight the potential of multimodal information in improving language understanding and modeling.	公開日:2024-09-27 翻訳日:2024-11-09 15:24:36
# CompoNeRF:編集可能な3Dシーンレイアウトによるテキスト誘導多目的合成型NeRF CompoNeRF: Text-guided Multi-object Compositional NeRF with Editable 3D Scene Layout ( http://arxiv.org/abs/2303.13843v5 ) ライセンス: Link先を確認	Haotian Bai, Yuanhuiyi Lyu, Lutao Jiang, Sijia Li, Haonan Lu, Xiaodong Lin, Lin Wang,	(参考訳) テキストから3Dの形式は、AR/VRのための編集可能な3Dシーンを作成する上で重要な役割を果たす。最近の進歩は、テキストから3Dオブジェクト生成のための事前訓練された拡散モデルとニューラルラジアンス場(NeRF)を融合させる可能性を示している。しかし、永続的な課題のひとつは、一貫性のあるマルチオブジェクト環境を正確に解析し再生する能力が不十分であることだ。特に、これらのモデルは、多目的テキストによって指定される量やスタイルを正確に表現することが困難であり、しばしば、意味的な複雑さにマッチしないレンダリングの忠実度の崩壊を招く。さらに、これらの要素を一貫した3Dシーンに統合することは、拡散モデルに固有の一般的な分布に起因する重大な課題である。そこで我々は,「ガイダンス崩壊」の問題に対処し,さらにシーンの一貫性を高めるために,編集可能な3Dシーンレイアウトとオブジェクト固有およびシーンワイドガイダンス機構を統合することで,CompoNeRFと呼ばれる新しいフレームワークを提案する。まず、複雑なテキストを複数のNeRFを配置したレイアウトに解釈し、それぞれを対応するサブテキストプロンプトと組み合わせて、正確なオブジェクトの描写を行う。次に、調整された合成モジュールがこれらのNeRFをシームレスにブレンドして一貫性を促進し、二重レベルのテキストガイダンスが曖昧さを低減し、精度を高める。特に、我々の構成設計では分解が可能である。これにより、編集されたレイアウトやテキストプロンプトに基づいてフレキシブルなシーン編集と新しいシーンへの再構成が可能になる。オープンソースのStable Diffusionモデルを用いて、CompoNeRFは高忠実度な多目的シーンを生成する。注目すべきは、このフレームワークはマルチビューCLIPスコア指標で最大54%の改善を実現している点である。ユーザ調査の結果、提案手法は多目的シーン生成における意味的精度、多視点一貫性、個々のオブジェクトの認識性を大幅に向上させたことが示された。 Text-to-3D form plays a crucial role in creating editable 3D scenes for AR/VR. Recent advances have shown promise in merging neural radiance fields (NeRFs) with pre-trained diffusion models for text-to-3D object generation. However, one enduring challenge is their inadequate capability to accurately parse and regenerate consistent multi-object environments. Specifically, these models encounter difficulties in accurately representing quantity and style prompted by multi-object texts, often resulting in a collapse of the rendering fidelity that fails to match the semantic intricacies. Moreover, amalgamating these elements into a coherent 3D scene is a substantial challenge, stemming from generic distribution inherent in diffusion models.
To tackle the issue of 'guidance collapse' and further enhance scene consistency, we propose a novel framework, dubbed CompoNeRF, by integrating an editable 3D scene layout with object-specific and scene-wide guidance mechanisms. It initiates by interpreting a complex text into the layout populated with multiple NeRFs, each paired with a corresponding subtext prompt for precise object depiction. Next, a tailored composition module seamlessly blends these NeRFs, promoting consistency, while the dual-level text guidance reduces ambiguity and boosts accuracy. Noticeably, our composition design permits decomposition. This enables flexible scene editing and recomposition into new scenes based on the edited layout or text prompts. Utilizing the open-source Stable Diffusion model, CompoNeRF generates multi-object scenes with high fidelity. Remarkably, our framework achieves up to a 54% improvement by the multi-view CLIP score metric. Our user study indicates that our method has significantly improved semantic accuracy, multi-view consistency, and individual recognizability for multi-object scene generation.	公開日:2024-09-24 翻訳日:2024-11-09 15:24:36
# データセットアーキタイプを用いた高レベル合成データ生成 High-Level Synthetic Data Generation with Data Set Archetypes ( http://arxiv.org/abs/2303.14301v3 ) ライセンス: Link先を確認	Michael J. Zellinger, Peter Bühlmann,	(参考訳) クラスタ分析は、異なるアルゴリズムの評価と比較に有効なベンチマークに依存している。クラスタ間の重なり合いやクラスタ形状の変化など,データセットの重要な特徴を効果的に変化させることができるため,合成データのシミュレーション研究が一般的である。残念ながら、評価シナリオのキュレートは、「全く異なる形状のクラスタ」のような高レベルのシナリオ記述と一致するように、実践者が(クラスタ共分散行列のような)低レベルの幾何学的パラメータを見つけなければならないため、しばしば手間がかかる。ベンチマークをより便利かつ有益なものにするために,データセットのアーキタイプに基づく合成データ生成を提案する。このパラダイムでは、ユーザは高いレベルで評価シナリオを記述し、ソフトウェアは所望の特性を持つデータセットを自動的に生成する。このようなデータセットのアーキタイプと大規模言語モデル(LLM)を組み合わせることで、評価シナリオの言語記述のみからベンチマークを設定することができる。このワークフローを実装したオープンソースのPythonパッケージであるrepliclustを提供しています。言語による入力からのデータ生成のデモは https://demo.repliclust.org で公開されている。 Cluster analysis relies on effective benchmarks for evaluating and comparing different algorithms. Simulation studies on synthetic data are popular because important features of the data sets, such as the overlap between clusters, or the variation in cluster shapes, can be effectively varied. Unfortunately, curating evaluation scenarios is often laborious, as practitioners must find lower-level geometric parameters (like cluster covariance matrices) to match a higher-level scenario description like "clusters with very different shapes." To make benchmarks more convenient and informative, we propose synthetic data generation based on data set archetypes. In this paradigm, the user describes an evaluation scenario in a high-level manner, and the software automatically generates data sets with the desired characteristics. Combining such data set archetypes with large language models (LLMs), it is possible to set up benchmarks purely from verbal descriptions of the evaluation scenarios. We provide an open-source Python package, repliclust, that implements this workflow. A demo of data generation from verbal inputs is available at https://demo.repliclust.org.	公開日:2024-09-21 翻訳日:2024-11-09 15:24:36
# Cesno: 新しいプログラミング言語の初期設計 Cesno: The Initial Design of a New Programming Language ( http://arxiv.org/abs/2303.15750v4 ) ライセンス: Link先を確認	Ozelot Vanilla, Jingxiang Yu, Hemn Barzan Abdalla,	(参考訳) プログラミング言語は非常に多彩で、開発者は個々の要件に合ったアプリケーションやプログラムを作成できます。この記事では、高度でユーザフレンドリで使いやすいプログラミング環境を提供するためにゼロから設計された、Cesnoという新しい言語を紹介します。 Cesnoの構文は他の人気のある言語と似ているため、学習と作業が簡単になる。構文シュガー、組み込みライブラリ、関数型プログラミングのサポート、オブジェクト指向プログラミング、動的型付け、型システム、さまざまな関数パラメータと制約など、他の言語の機能が含まれている。この記事では、Cesnoの文法の設計について検討し、Cesnoがどのようにコードを処理し、コンパイルするかを概観し、Cesnoのコードがどのようなもので、どのように開発に役立てるかを検証します。 Programming languages are incredibly versatile, enabling developers to create applications and programs that suit their individual requirements. This article introduces a new language called Cesno, designed from the ground up to offer an advanced, user-friendly, and easy-to-use programming environment. Cesno's syntax is similar to other popular languages, making it simple to learn and work with. It incorporates features from other languages, such as syntactic sugar, a built-in library, support for functional programming, object-oriented programming, dynamic typing, a type system, and a variety of function parameters and restrictions. This article will explore the design of Cesno's grammar, provide a brief overview of how Cesno processes and compiles code, and provide examples of what Cesno's code looks like and how it can aid in development.	公開日:2024-09-22 翻訳日:2024-11-09 15:24:36
# テンソルネットを用いた量子フーリエ変換のシミュレーション、グローバーのアルゴリズム、および限定絡み付き量子カウントアルゴリズム Simulating the quantum Fourier transform, Grover's algorithm, and the quantum counting algorithm with limited entanglement using tensor-networks ( http://arxiv.org/abs/2304.01751v2 ) ライセンス: Link先を確認	Marcel Niedermeier, Jose L. Lado, Christian Flindt,	(参考訳) 量子アルゴリズムは、計算問題を大きなヒルベルト空間における量子進化として再構成する。ほとんどの量子アルゴリズムは、時間進化は完全にユニタリであり、完全なヒルベルト空間が利用できると仮定する。しかし実際には、利用可能な絡み合いは限られており、量子アルゴリズムの忠実度は低下する。量子回路の絡み合いを制限できるため、量子アルゴリズムの実行を限定的にシミュレートするため、テンソルネットワーク法は有用なフレームワークを提供する。そこで本研究では,量子フーリエ変換,グロバーのアルゴリズム,および量子カウントアルゴリズムのエンタングルメントが減少するにつれて,テンソルネットワークを用いて量子フーリエ変換の忠実度を解析し,各アルゴリズムの実行時に発生するエンタングルメントをマッピングする。いずれの場合も,絡み合いが幾分小さくても,アルゴリズムは高い忠実度で実行可能であることがわかった。この結果は将来の量子コンピュータ上でこれらのアルゴリズムを実行することを約束しており、テンソルネットワークに基づくシミュレーション手法は他の量子アルゴリズムにも適用することができる。 Quantum algorithms reformulate computational problems as quantum evolutions in a large Hilbert space. Most quantum algorithms assume that the time-evolution is perfectly unitary and that the full Hilbert space is available. However, in practice, the available entanglement may be limited, leading to a reduced fidelity of the quantum algorithms. To simulate the execution of quantum algorithms with limited entanglement, tensor-network methods provide a useful framework, since they allow us to restrict the entanglement in a quantum circuit. Thus, we here use tensor-networks to analyze the fidelity of the quantum Fourier transform, Grover's algorithm, and the quantum counting algorithm as the entanglement is reduced, and we map out the entanglement that is generated during the execution of each algorithm. In all three cases, we find that the algorithms can be executed with high fidelity even if the entanglement is somewhat reduced. Our results are promising for the execution of these algorithms on future quantum computers, and our simulation method based on tensor networks may also be applied to other quantum algorithms.	公開日:2024-09-25 翻訳日:2024-11-09 15:24:36
# 作用素空間におけるシュミット分解による量子絡み合いの解析 Analyzing quantum entanglement with the Schmidt decomposition in operator space ( http://arxiv.org/abs/2304.02447v2 ) ライセンス: Link先を確認	Chengjie Zhang, Sophia Denker, Ali Asadian, Otfried Gühne,	(参考訳) 絡み合いを特徴づけることは量子情報科学の中心である。絡み合いを示す特別な可観測量、いわゆる絡み合い証人(entanglement witness)は、この作業に広く使用される道具である。これらの証人の構成は典型的には、ある絡み合ったターゲット状態に対して高い忠実度を持つ量子状態もまた絡み合っているという観察に依存している。可観測量のシュミット分解に基づいて絡み合い証人を構築するための一般的な方法を提案する。この方法は、二体系だけでなく、より重要なことに多体系でも機能し、忠実度に基づく構成よりも厳密に強力である。得られた証人は、絡み合いを定量化したり、その次元を特徴づけるためにも使うことができる。最後に,本手法が絡み合い検出を大幅に改善する、実験的に重要な例について述べる。 Characterizing entanglement is central for quantum information science. Special observables which indicate entanglement, so-called entanglement witnesses, are a widely used tool for this task. The construction of these witnesses typically relies on the observation that quantum states with a high fidelity to some entangled target state are entangled, too. We introduce a general method to construct entanglement witnesses based on the Schmidt decomposition of observables. The method works for two- and, more importantly, many-body systems and is strictly stronger than fidelity-based constructions. The resulting witnesses can also be used to quantify entanglement as well as to characterize the dimensionality of it. Finally, we present experimentally relevant examples, where our approach improves entanglement detection significantly.	公開日:2024-09-27 翻訳日:2024-11-09 15:13:22
# 神経集団動態と幾何学の解釈可能な統計的表現 Interpretable statistical representations of neural population dynamics and geometry ( http://arxiv.org/abs/2304.03376v4 ) ライセンス: Link先を確認	Adam Gosztolai, Robert L. Peach, Alexis Arnaudon, Mauricio Barahona, Pierre Vandergheynst,	(参考訳) ニューロンの集団のダイナミクスは、低次元多様体上で一般的に進化する。したがって、解釈可能かつ一貫した潜在表現を推論するために、ニューラル多様体上の動的過程を学ぶ方法が必要である。そこで我々は,多様体上のダイナミクスを局所流れ場に分解し,教師なしの幾何学的深層学習を用いてそれらを共通の潜在空間にマッピングする表現学習法 MARBLE を導入する。シミュレーションされた非線形力学系, リカレントニューラルネットワーク, 霊長類およびげっ歯類からの実験的な単一ニューロン記録において, 利得変調, 意思決定, 内部状態の変化の間の高次元神経力学をパラメータ化する創発的な低次元潜在表現が発見された。これらの表現はニューラルネットワークや動物間で一貫性があり、認知計算の堅牢な比較を可能にする。広範囲なベンチマークでは、現在の表現学習アプローチと比較して、MARBLEが最小限のユーザ入力で最先端の個体内および個体間デコード精度を達成することが示される。この結果から, 多様体構造は, 強力な復号アルゴリズムを開発し, 実験間でデータを同化するための強力な帰納バイアスを与えることが示唆された。 The dynamics of neuron populations commonly evolve on low-dimensional manifolds. Thus, we need methods that learn the dynamical processes over neural manifolds to infer interpretable and consistent latent representations. We introduce a representation learning method, MARBLE, that decomposes on-manifold dynamics into local flow fields and maps them into a common latent space using unsupervised geometric deep learning. In simulated non-linear dynamical systems, recurrent neural networks, and experimental single-neuron recordings from primates and rodents, we discover emergent low-dimensional latent representations that parametrise high-dimensional neural dynamics during gain modulation, decision-making, and changes in the internal state. These representations are consistent across neural networks and animals, enabling the robust comparison of cognitive computations. Extensive benchmarking demonstrates state-of-the-art within- and across-animal decoding accuracy of MARBLE compared with current representation learning approaches, with minimal user input. Our results suggest that manifold structure provides a powerful inductive bias to develop powerful decoding algorithms and assimilate data across experiments.	公開日:2024-09-24 翻訳日:2024-11-09 15:13:22
# CRISP:階層強化学習のための原始インフォームドサブゴール予測のカリキュラム化 CRISP: Curriculum Inducing Primitive Informed Subgoal Prediction for Hierarchical Reinforcement Learning ( http://arxiv.org/abs/2304.03535v5 ) ライセンス: Link先を確認	Utsav Singh, Vinay P. Namboodiri,	(参考訳) 階層的強化学習(HRL)は、時間的抽象を用いて複雑な長期(long horizon)問題を解く有望な手法である。しかし、低レベルのプリミティブが非定常である場合、高レベルのポリシーを訓練することが難しいため、同時にポリシー階層を学習することは不安定である。本稿では、強化学習と模倣学習を用いて、進化していく低レベルのプリミティブに対して達成可能なサブゴールのカリキュラムを効果的に生成する新しいHRLアルゴリズムであるCRISPを提案する。 CRISPは低レベルのプリミティブを使用し、新しいprimitive informed parsing(PIP)アプローチによって少数の専門家によるデモンストレーションに対して定期的にデータの再ラベル付けを行うことで、非定常性を緩和する。私たちのアプローチでは、少数の専門家によるデモンストレーションにしかアクセスできないことを仮定するので、ほとんどのロボット制御タスクに適しています。複雑なロボット迷路ナビゲーションとロボット操作タスクの実験的評価は、階層的なカリキュラム学習の導入がサンプル効率を大幅に改善し、時間的に拡張されたタスクを解決するための効率的な目標条件付きポリシーをもたらすことを示した。さらに,複雑な操作タスクにおける実世界のロボット実験を行い,CRISPが実世界のシナリオで優れた汎化性能を示すことを実証した。 Hierarchical reinforcement learning (HRL) is a promising approach that uses temporal abstraction to solve complex long horizon problems. However, simultaneously learning a hierarchy of policies is unstable as it is challenging to train higher-level policy when the lower-level primitive is non-stationary. In this paper, we present CRISP, a novel HRL algorithm that effectively generates a curriculum of achievable subgoals for evolving lower-level primitives using reinforcement learning and imitation learning. CRISP uses the lower level primitive to periodically perform data relabeling on a handful of expert demonstrations, using a novel primitive informed parsing (PIP) approach, thereby mitigating non-stationarity. Since our approach only assumes access to a handful of expert demonstrations, it is suitable for most robotic control tasks. Experimental evaluations on complex robotic maze navigation and robotic manipulation tasks demonstrate that inducing hierarchical curriculum learning significantly improves sample efficiency, and results in efficient goal conditioned policies for solving temporally extended tasks. Additionally, we perform real world robotic experiments on complex manipulation tasks and demonstrate that CRISP demonstrates impressive generalization in real world scenarios.	公開日:2024-09-24 翻訳日:2024-11-09 15:13:22
# Maybenot: トラフィック解析防御のためのフレームワーク Maybenot: A Framework for Traffic Analysis Defenses ( http://arxiv.org/abs/2304.09510v2 ) ライセンス: Link先を確認	Tobias Pulls, Ethan Witwer,	(参考訳) エンドツーエンド暗号化は、インターネットユーザのプライバシーを保護する強力なツールである。 TorやVPN、暗号化メッセージングといった技術の利用の増加とともに、ネットワーク上の敵対者がインターネットトラフィックを監視して検閲することがますます難しくなってきている。敵対者に残された手段の一つがトラフィック解析、すなわち暗号化されたトラフィックのパターンを分析し、ユーザとその活動に関する情報を推測することである。ディープラーニングによる最近の改善により、トラフィック解析攻撃はこれまで以上に効果的になった。我々は、トラフィック解析防御のためのフレームワークであるMaybenotを提示する。 Maybenotは使いやすく、既存のエンドツーエンドの暗号化プロトコルに統合できるように設計されている。これはRustプログラミング言語でクレート(ライブラリ)として実装され、防御の開発を促進するためのシミュレータとともに提供されている。 Maybenotの防御は、パディングを注入したり、送信トラフィックをブロックしたりするためのアクションをスケジュールする確率的状態マシンとして表現される。 MaybenotはPerry氏とKadianakis氏によるTor Circuit Padding Frameworkを発展させたものであり、幅広いプロトコルとユースケースをサポートするように設計されている。 End-to-end encryption is a powerful tool for protecting the privacy of Internet users. Together with the increasing use of technologies such as Tor, VPNs, and encrypted messaging, it is becoming increasingly difficult for network adversaries to monitor and censor Internet traffic. One remaining avenue for adversaries is traffic analysis: the analysis of patterns in encrypted traffic to infer information about the users and their activities. Recent improvements using deep learning have made traffic analysis attacks more effective than ever before. We present Maybenot, a framework for traffic analysis defenses. Maybenot is designed to be easy to use and integrate into existing end-to-end encrypted protocols. It is implemented in the Rust programming language as a crate (library), together with a simulator to further the development of defenses. Defenses in Maybenot are expressed as probabilistic state machines that schedule actions to inject padding or block outgoing traffic. Maybenot is an evolution from the Tor Circuit Padding Framework by Perry and Kadianakis, designed to support a wide range of protocols and use cases.	公開日:2024-09-27 翻訳日:2024-11-09 15:13:22
# 個人データフローの可視化:Booking.comの事例から Visualising Personal Data Flows: Insights from a Case Study of Booking.com ( http://arxiv.org/abs/2304.09603v5 ) ライセンス: Link先を確認	Haiyue Yuan, Matthew Boakes, Xiao Ma, Dongmei Cao, Shujun Li,	(参考訳) 商業組織は、絶え間なく増加する個人情報を保持し、処理している。ポリシーや法律は、これらの企業がデータの収集、保管、処理、共有に関してより透明性を持たなければならないように、継続的に変更されている。本稿では、プライバシポリシから抽出した個人データフローを可視化するケーススタディとして、Booking.comを取り上げている。消費者の個人情報の共有方法を示すことによって、私たちは質問を提起し、プライバシポリシを使用してオンラインユーザに対して、個人データフローの真の規模と状況について通知する際の課題と制限に関する議論を拡大します。このケーススタディは、よりデータフロー指向のプライバシポリシ分析に関する今後の研究や、複雑なビジネスエコシステムにおける個人データフローに関するより包括的なオントロジーの構築について教えてくれます。 Commercial organisations are holding and processing an ever-increasing amount of personal data. Policies and laws are continually changing to require these companies to be more transparent regarding the collection, storage, processing and sharing of this data. This paper reports our work of taking Booking.com as a case study to visualise personal data flows extracted from their privacy policy. By showcasing how the company shares its consumers' personal data, we raise questions and extend discussions on the challenges and limitations of using privacy policies to inform online users about the true scale and the landscape of personal data flows. This case study can inform us about future research on more data flow-oriented privacy policy analysis and on the construction of a more comprehensive ontology on personal data flows in complicated business ecosystems.	公開日:2024-09-20 翻訳日:2024-11-09 15:13:22
# CKBP v2: Commonsense Knowledge Base Populationのためのアノテーションと推論の改善 CKBP v2: Better Annotation and Reasoning for Commonsense Knowledge Base Population ( http://arxiv.org/abs/2304.10392v2 ) ライセンス: Link先を確認	Tianqing Fang, Quyet V. Do, Zihao Zheng, Weiqi Wang, Sehyun Choi, Zhaowei Wang, Yangqiu Song,	(参考訳) Commonsense Knowledge Bases (CSKB) Populationは、CSKBの知識を外部リソースで自動的に拡張することを目的としており、NLPにおいて重要だが困難なタスクである。 Fang et al. (2021a) は CKBP v1 の評価セットを持つ CSKB Population (CKBP) フレームワークを提案した。しかし、CKBP v1は、かなりの数の誤ってラベル付けされた回答を含むクラウドソースアノテーションに依存しており、評価セットはランダムサンプリングのため外部知識ソースとの整合性に欠ける。本稿では,上記の2つの問題に,ドメインエキスパートをアノテータとして採用し,多様な敵対的サンプルを取り入れて,評価データをより代表的なものにすることで対処する,高品質なCSKB Population評価セットであるCKBP v2を紹介する。 CKBP v2 は CSKB Population タスクの挑戦的,代表的評価データセットとして機能し,その開発セットは,下流コモンセンス推論の知識獲得を改善するポピュレーションモデルの選択を支援する。より良いポピュレーションモデルは、生成的コモンセンス推論とゼロショットコモンセンス質問応答の両方の追加の教師信号として、より情報的なコモンセンス知識を得るのにも役立つ。具体的には、DeBERTa-v3-large(He et al., 2023b)に基づく質問応答モデルは、ChatGPTやGPT-3.5など、ゼロショット設定で強力な大規模言語モデルよりも優れている。 Commonsense Knowledge Bases (CSKB) Population, which aims at automatically expanding knowledge in CSKBs with external resources, is an important yet hard task in NLP. Fang et al. (2021a) proposed a CSKB Population (CKBP) framework with an evaluation set CKBP v1. However, CKBP v1 relies on crowdsourced annotations that suffer from a considerable number of mislabeled answers, and the evaluation set lacks alignment with the external knowledge source due to random sampling. In this paper, we introduce CKBP v2, a new high-quality CSKB Population evaluation set that addresses the two aforementioned issues by employing domain experts as annotators and incorporating diversified adversarial samples to make the evaluation data more representative. We show that CKBP v2 serves as a challenging and representative evaluation dataset for the CSKB Population task, while its development set aids in selecting a population model that leads to improved knowledge acquisition for downstream commonsense reasoning. A better population model can also help acquire more informative commonsense knowledge as additional supervision signals for both generative commonsense inference and zero-shot commonsense question answering. Specifically, the question-answering model based on DeBERTa-v3-large (He et al., 2023b) even outperforms powerful large language models in a zero-shot setting, including ChatGPT and GPT-3.5.	公開日:2024-09-21 翻訳日:2024-11-09 15:13:22
# 非凸非平滑最適化問題に対する射影近位勾配降下法:クルディカ・ロジャシエヴィチ(KL)特性を用いない高速収束 Projective Proximal Gradient Descent for A Class of Nonconvex Nonsmooth Optimization Problems: Fast Convergence Without Kurdyka-Lojasiewicz (KL) Property ( http://arxiv.org/abs/2304.10499v2 ) ライセンス: Link先を確認	Yingzhen Yang, Ping Li,	(参考訳) 非凸および非滑らかな最適化問題は統計学と機械学習にとって重要かつ困難な問題である。本稿では,非凸かつ非平滑な最適化問題のクラスを解くProjected Proximal Gradient Descent(PPGD)を提案する。ここで非凸性と非平滑性は,非凸だが区分的に凸な非平滑正則化項に由来する。 Kurdyka-Łojasiewicz(KŁ)特性に基づく非凸および非滑らか問題に対する加速PGD法の既存の収束解析とは対照的に、PPGDの局所的高速収束を示す新しい理論解析を提供する。 PPGDは、緩やかな仮定の下での非凸および非滑らかな問題のクラスにおいて、有限の $k_0$ に対して反復数が $k \ge k_0$ のとき $\mathcal{O}(1/k^2)$ の高速収束率を達成することが証明された。これは,リプシッツ連続な勾配を持つ滑らかな凸目的関数に対する一次法のNesterovの最適収束率に局所的に一致する。実験の結果, PPGDの有効性が示された。 Nonconvex and nonsmooth optimization problems are important and challenging for statistics and machine learning. In this paper, we propose Projected Proximal Gradient Descent (PPGD) which solves a class of nonconvex and nonsmooth optimization problems, where the nonconvexity and nonsmoothness come from a nonsmooth regularization term which is nonconvex but piecewise convex. In contrast with existing convergence analysis of accelerated PGD methods for nonconvex and nonsmooth problems based on the Kurdyka-Łojasiewicz (KŁ) property, we provide a new theoretical analysis showing local fast convergence of PPGD. It is proved that PPGD achieves a fast convergence rate of $\mathcal{O}(1/k^2)$ when the iteration number $k \ge k_0$ for a finite $k_0$ on a class of nonconvex and nonsmooth problems under mild assumptions, which is locally Nesterov's optimal convergence rate of first-order methods on smooth and convex objective function with Lipschitz continuous gradient. Experimental results demonstrate the effectiveness of PPGD.	公開日:2024-09-25 翻訳日:2024-11-09 15:13:22
# RoCOCO: 画像テキストマッチングモデルをストレステストするためのMS-COCOのロバスト性ベンチマーク RoCOCO: Robustness Benchmark of MS-COCO to Stress-test Image-Text Matching Models ( http://arxiv.org/abs/2304.10727v4 ) ライセンス: Link先を確認	Seulki Park, Daeho Um, Hajung Yoon, Sanghyuk Chun, Sangdoo Yun,	(参考訳) 様々な下流タスクで視覚言語モデルが広く使われているため、その堅牢性を評価することが重要である。本稿では,視覚言語モデルのロバスト性を評価するためのベンチマークを提案する。我々は、ロバストなモデルは言語的意味論と視覚的意味論の両方を適切に理解し、明示的な変化に耐性を持つべきだと考える。この目的を追求するため、MS-COCOテストセットにテキストと画像の新しい変種を作成し、新しいデータを用いてSOTA(State-of-the-art)モデルを再評価する。具体的には、単語を置換してテキストの意味を変更し、画像ミキシング技術を用いて、視覚的コンテキストを一部保ちつつ目立つピクセル変化を加えた画像を生成する。提案したベンチマークでの評価は、多くのSOTAモデルにおいて大きなパフォーマンス劣化を示し(例えば、Image-to-Text Recall@1がBLIPで81.9\% $\rightarrow$ 48.4\%、VSE$\infty$で66.1\% $\rightarrow$ 37.6\%)、モデルはしばしば元のものより改変されたテキストや画像を選好する。これは、現在の視覚言語モデルは微妙な変化に悩まされ、しばしばテキストや画像の全体的なコンテキストを理解するのに失敗していることを示している。これらの知見に基づき,より堅牢な埋め込み学習のために,意味的コントラスト損失と視覚的コントラスト損失を提案する。データセットとコードは https://github.com/pseulki/rococo で入手できる。 With the extensive use of vision-language models in various downstream tasks, evaluating their robustness is crucial. In this paper, we propose a benchmark for assessing the robustness of vision-language models. We believe that a robust model should properly understand both linguistic and visual semantics and be resilient to explicit variations. In pursuit of this goal, we create new variants of texts and images in the MS-COCO test set and re-evaluate the state-of-the-art (SOTA) models with the new data. Specifically, we alter the meaning of text by replacing a word, and generate visually altered images that maintain some visual context while introducing noticeable pixel changes through image mixing techniques. Our evaluations on the proposed benchmark reveal substantial performance degradation in many SOTA models (e.g., Image-to-Text Recall@1: 81.9\% $\rightarrow$ 48.4\% in BLIP, 66.1\% $\rightarrow$ 37.6\% in VSE$\infty$), with the models often favoring the altered texts/images over the original ones. This indicates the current vision-language models struggle with subtle changes and often fail to understand the overall context of texts and images. Based on these findings, we propose semantic contrastive loss and visual contrastive loss to learn more robust embedding. Datasets and code are available at https://github.com/pseulki/rococo.	公開日:2024-09-27 翻訳日:2024-11-09 15:13:22
# サービス拒否かきめ細かな制御か:連合学習に対する柔軟なモデルポイズニング攻撃に向けて Denial-of-Service or Fine-Grained Control: Towards Flexible Model Poisoning Attacks on Federated Learning ( http://arxiv.org/abs/2304.10783v3 ) ライセンス: Link先を確認	Hangtao Zhang, Zeming Yao, Leo Yu Zhang, Shengshan Hu, Chao Chen, Alan Liew, Zhetao Li,	(参考訳) 連合学習(FL)は、敵がグローバルアグリゲーションの結果を破損させ、サービス拒否(DoS)を引き起こすポイズニング攻撃に対して脆弱である。特定方向の悪意ある摂動の振幅を最適化してDoSを発生させる最近のモデルポイズニング攻撃とは違って,多様な攻撃目標を達成できるFlexible Model Poisoning Attack(FMPA)を提案する。 FLシステムに関する余分な知識(例えば、アグリゲーションルールや良性デバイスの更新など)を敵が利用できない現実的な脅威シナリオを考える。 FMPAは、グローバルな履歴情報を利用して、次のラウンドのグローバルモデルを良性の参照として予測する推定器を構築する。その後、参照モデルを微調整し、低い精度と小さな摂動を持つ所望の汚染モデルを得る。 DoSを発生させる目的の他に、FMPAを自然に拡張してきめ細かく制御可能な攻撃を行うことで、グローバル精度を正確に低減することができる。精密な制御を手にした悪意のあるFLサービスプロバイダは、気付かれることなく競合相手に対して優位に立てるため、DoS以外の新たな攻撃面がFLに開かれる。 DoSを目的とする場合でも、実験の結果、FMPAはグローバル精度を著しく低下させ、最先端の6つの攻撃を上回ることが示された。 Federated learning (FL) is vulnerable to poisoning attacks, where adversaries corrupt the global aggregation results and cause denial-of-service (DoS). Unlike recent model poisoning attacks that optimize the amplitude of malicious perturbations along certain prescribed directions to cause DoS, we propose a Flexible Model Poisoning Attack (FMPA) that can achieve versatile attack goals. We consider a practical threat scenario where no extra knowledge about the FL system (e.g., aggregation rules or updates on benign devices) is available to adversaries. FMPA exploits the global historical information to construct an estimator that predicts the next round of the global model as a benign reference. It then fine-tunes the reference model to obtain the desired poisoned model with low accuracy and small perturbations. Besides the goal of causing DoS, FMPA can be naturally extended to launch a fine-grained controllable attack, making it possible to precisely reduce the global accuracy. Armed with precise control, malicious FL service providers can gain advantages over their competitors without getting noticed, hence opening a new attack surface in FL other than DoS. Even for the purpose of DoS, experiments show that FMPA significantly decreases the global accuracy, outperforming six state-of-the-art attacks.	公開日:2024-09-26 翻訳日:2024-11-09 15:13:22
# 確率的エージェントドロップアウト下におけるマルチエージェントMDPのモデル自由学習と最適ポリシー設計 Model-Free Learning and Optimal Policy Design in Multi-Agent MDPs Under Probabilistic Agent Dropout ( http://arxiv.org/abs/2304.12458v2 ) ライセンス: Link先を確認	Carmel Fiscko, Soummya Kar, Bruno Sinopoli,	(参考訳) 本研究では,エージェントのドロップアウトが起こりうるマルチエージェントマルコフ決定プロセス(MDP)と,ドロップアウト前のシステムの制御とサンプリングに基づくドロップアウト後のシステムのポリシーの計算について検討する。中央プランナーの目的は、エージェントのドロップアウト確率の事前知識が与えられた場合、期待されるシステムの価値を最大化する最適なポリシーを見つけることである。特定の遷移独立性と報酬分離性構造を持つMDPに対して、システムからエージェントを取り除くことは、新しい状態と行動空間を持つ残りのエージェントと、除去されたエージェントを周辺化した遷移ダイナミクスと、除去されたエージェントとは独立な報酬からなる新しいMDPを形成すると仮定する。まず,これらの仮定の下で,期待されるドロップアウト後のシステムの価値が単一のMDPで表現できることを示す。この「ロバストMDP」は、Nをエージェント数として、システムの $2^N$ 通りの実現を全て評価する必要性を排除する。さらに重要なことに、モデルフリーの文脈では、ロバストMDPの価値をドロップアウト前のシステムによって生成されたサンプルで推定できることが示され、つまり、ドロップアウトが起こる前にロバストなポリシーを見つけることができる。この事実を利用して、優れたドロップアウト前のポリシーで既存のシステムを制御しながら、ドロップアウトシナリオに対するポリシー評価を行うポリシー重点サンプリング(IS)ルーチンを提案する。ポリシーISルーチンは、ロバストMDPと特定のドロップアウト後のシステムの実現の両方に対して価値推定を生成し、指数的信頼境界で正当化される。最後に、このアプローチの有用性をシミュレーションで検証し、エージェントのドロップアウトの構造的特性が、ドロップアウトが起こる前にコントローラが優れたドロップアウト後のポリシーを見つけるのにどう役立つかを示す。 This work studies a multi-agent Markov decision process (MDP) that can undergo agent dropout and the computation of policies for the post-dropout system based on control and sampling of the pre-dropout system. The central planner's objective is to find an optimal policy that maximizes the value of the expected system given a priori knowledge of the agents' dropout probabilities. For MDPs with a certain transition independence and reward separability structure, we assume that removing agents from the system forms a new MDP comprised of the remaining agents with new state and action spaces, transition dynamics that marginalize the removed agents, and rewards that are independent of the removed agents. We first show that under these assumptions, the value of the expected post-dropout system can be represented by a single MDP; this "robust MDP" eliminates the need to evaluate all $2^N$ realizations of the system, where N denotes the number of agents. More significantly, in a model-free context, it is shown that the robust MDP value can be estimated with samples generated by the pre-dropout system, meaning that robust policies can be found before dropout occurs. This fact is used to propose a policy importance sampling (IS) routine that performs policy evaluation for dropout scenarios while controlling the existing system with good pre-dropout policies. The policy IS routine produces value estimates for both the robust MDP and specific post-dropout system realizations and is justified with exponential confidence bounds. Finally, the utility of this approach is verified in simulation, showing how structural properties of agent dropout can help a controller find good post-dropout policies before dropout occurs.	公開日:2024-09-22 翻訳日:2024-11-09 15:13:22
# 1ビット行列補完のためのMajorization-Minimizationガウス・ニュートン法 A Majorization-Minimization Gauss-Newton Method for 1-Bit Matrix Completion ( http://arxiv.org/abs/2304.13940v3 ) ライセンス: Link先を確認	Xiaoqian Liu, Xu Han, Eric C. Chi, Boaz Nadler,	(参考訳) 1ビット行列補完では、基礎となる低ランク行列を二値観測の部分集合から推定することを目的としている。本稿では,Majorization-Minimization Gauss-Newton (MMGN) と呼ばれる新しい1ビット行列補完法を提案する。本手法は,元の最適化問題を標準的な低ランク行列補完問題の列に変換するMajorization-Minimization原理に基づく。これらの部分問題のそれぞれを、仮定された低ランク構造を明示的に強制する分解法により解き、その後、ガウス・ニュートン法を適用する。シミュレーションと実データ例を用いて、既存の1ビット行列補完法と比較して、MMGNが同等かそれ以上に正確な推定値を出力することを示す。加えて、MMGNはしばしば著しく高速であり、基礎となる行列のスパイキネスに対する感度も低い。元の目的関数を直接最小化する3つの標準的な汎用最適化手法と比較しても、MMGNは特に観測された成分の割合が小さい場合に、明確な計算上の優位性を示す。 In 1-bit matrix completion, the aim is to estimate an underlying low-rank matrix from a partial set of binary observations. We propose a novel method for 1-bit matrix completion called Majorization-Minimization Gauss-Newton (MMGN). Our method is based on the majorization-minimization principle, which converts the original optimization problem into a sequence of standard low-rank matrix completion problems. We solve each of these sub-problems by a factorization approach that explicitly enforces the assumed low-rank structure and then apply a Gauss-Newton method. Using simulations and a real data example, we illustrate that in comparison to existing 1-bit matrix completion methods, MMGN outputs comparable if not more accurate estimates. In addition, it is often significantly faster, and less sensitive to the spikiness of the underlying matrix. In comparison with three standard generic optimization approaches that directly minimize the original objective, MMGN also exhibits a clear computational advantage, especially when the fraction of observed entries is small.	公開日:2024-09-23 翻訳日:2024-11-09 15:13:22
# 移動エゴ車からのイベントフリー移動物体セグメンテーション Event-Free Moving Object Segmentation from Moving Ego Vehicle ( http://arxiv.org/abs/2305.00126v3 ) ライセンス: Link先を確認	Zhuyun Zhou, Zongwei Wu, Danda Pani Paudel, Rémi Boutteau, Fan Yang, Luc Van Gool, Radu Timofte, Dominique Ginhac,	(参考訳) 動的シーンにおける移動物体セグメンテーション(MOS)は、特に移動するエゴ車から得られるシーケンスについて、自動運転にとって重要で難しいが、十分に研究されていない研究テーマである。ほとんどのセグメンテーション法は、オプティカルフローマップから得られるモーションキューを利用する。しかし、これらの手法は連続するRGBフレームから事前計算されるオプティカルフローに基づいていることが多いため、フレーム間で発生した事象の時間的考慮が無視され、相対的には静止して見えるが実際に動いている物体を識別する能力が制限される。これらの制約に対処するために,オプティカルフローに頼ることなくリッチなモーションキューを提供するイベントカメラを,より優れた映像理解のために利用することを提案する。この分野での研究を促進するために、我々はまず、移動するエゴ車からの移動物体セグメンテーションのための、この種のものとしては初となる新しい大規模データセットDSEC-MOSを導入する。ベンチマークでは、さまざまな主流の手法を選択し、データセット上でそれらを厳格に評価する。その後、イベントデータを活用可能な新しいネットワークであるEmoFormerを考案した。この目的のために、イベントの時間的事前情報を空間意味マップと融合させ、実際に動く物体を静的な背景から区別し、対象物体の周囲にもう一段階の密な教師信号を加える。提案するネットワークは,イベントデータをトレーニング時にのみ利用し,推論時にはイベント入力を必要としないため,効率の面でフレームのみの手法と直接的に比較でき,多くのアプリケーションでより広く利用することができる。徹底的な比較は、他のすべての方法と比較して、我々の手法の大幅な性能向上を浮き彫りにしている。ソースコードとデータセットは、https://github.com/ZZY-Zhou/DSEC-MOS で公開されている。 Moving object segmentation (MOS) in dynamic scenes is an important, challenging, but under-explored research topic for autonomous driving, especially for sequences obtained from moving ego vehicles. Most segmentation methods leverage motion cues obtained from optical flow maps. However, since these methods are often based on optical flows that are pre-computed from successive RGB frames, this neglects the temporal consideration of events occurring within the inter-frame, consequently constraining its ability to discern objects exhibiting relative staticity but genuinely in motion. To address these limitations, we propose to exploit event cameras for better video understanding, which provide rich motion cues without relying on optical flow. To foster research in this area, we first introduce a novel large-scale dataset called DSEC-MOS for moving object segmentation from moving ego vehicles, which is the first of its kind. For benchmarking, we select various mainstream methods and rigorously evaluate them on our dataset. Subsequently, we devise EmoFormer, a novel network able to exploit the event data. For this purpose, we fuse the event temporal prior with spatial semantic maps to distinguish genuinely moving objects from the static background, adding another level of dense supervision around our object of interest. Our proposed network relies only on event data for training but does not require event input during inference, making it directly comparable to frame-only methods in terms of efficiency and more widely usable in many application cases. The exhaustive comparison highlights a significant performance improvement of our method over all other methods. The source code and dataset are publicly available at: https://github.com/ZZY-Zhou/DSEC-MOS.	公開日:2024-09-25 翻訳日:2024-11-09 15:13:22
# 人工知能によるアグリフードシステムの構築 : 進歩・課題・機会に関する調査 Empowering Agrifood System with Artificial Intelligence: A Survey of the Progress, Challenges and Opportunities ( http://arxiv.org/abs/2305.01899v2 ) ライセンス: Link先を確認	Tao Chen, Liang Lv, Di Wang, Jing Zhang, Yue Yang, Zeyang Zhao, Chen Wang, Xiaowei Guo, Hao Chen, Qingye Wang, Yufei Xu, Qiming Zhang, Bo Du, Liangpei Zhang, Dacheng Tao,	(参考訳) 世界人口が急増するにつれて、アグリフードシステムをより生産的、効率的、安全かつ持続可能なものへと変革することが、潜在的な食糧不足を緩和するために不可欠である。近年、ディープラーニング(DL)のような人工知能(AI)技術は、言語、視覚、リモートセンシング(RS)、アグリフードシステムアプリケーションなど、様々な分野でその強みを実証している。しかし、アグリフードシステムに対するAIの全体的な影響は、まだ不明である。本稿では,AI技術がアグリフードシステムをどのように変革し,現代のアグリフード産業に貢献するかを,徹底的にレビューする。まず,取得・保存・処理技術を含む,アグリフードシステムにおけるデータ取得手法について概説する。第2に,農業,畜産,漁業などのアグリフードシステムにおけるAI手法の進歩を概観し,アグリフード分類,成長モニタリング,収量予測,品質評価などのトピックについて紹介する。さらに、AIで現代のアグリフードシステムを変革するための潜在的な課題と有望な研究機会を強調する。この調査が、この分野の新参者に全体像を提供し、さらなる研究の出発点になることを期待している。プロジェクトのWebサイトは https://github.com/Frenkie14/Agrifood-Survey である。 With the world population rapidly increasing, transforming our agrifood systems to be more productive, efficient, safe, and sustainable is crucial to mitigate potential food shortages. Recently, artificial intelligence (AI) techniques such as deep learning (DL) have demonstrated their strong abilities in various areas, including language, vision, remote sensing (RS), and agrifood systems applications. However, the overall impact of AI on agrifood systems remains unclear. In this paper, we thoroughly review how AI techniques can transform agrifood systems and contribute to the modern agrifood industry. Firstly, we summarize the data acquisition methods in agrifood systems, including acquisition, storage, and processing techniques. Secondly, we present a progress review of AI methods in agrifood systems, specifically in agriculture, animal husbandry, and fishery, covering topics such as agrifood classification, growth monitoring, yield prediction, and quality assessment. Furthermore, we highlight potential challenges and promising research opportunities for transforming modern agrifood systems with AI. We hope this survey could offer an overall picture to newcomers in the field and serve as a starting point for their further research. The project website is https://github.com/Frenkie14/Agrifood-Survey.	公開日:2024-09-26 翻訳日:2024-11-09 15:13:22
# 実世界の3次元教師データを用いない手持ち物体の3次元再構成 3D Reconstruction of Objects in Hands without Real World 3D Supervision ( http://arxiv.org/abs/2305.03036v2 ) ライセンス: Link先を確認	Aditya Prakash, Matthew Chang, Matthew Jin, Ruisen Tu, Saurabh Gupta,	(参考訳) 単一画像から手持ち物体を再構成する従来の研究は、3次元形状と対になった画像でモデルを訓練する。このようなデータは、現実の世界で大規模に収集することは困難である。したがって、これらの手法は、実環境(in-the-wild)で新しいオブジェクトが提示された場合にうまく汎化しない。 3Dの教師データは大きなボトルネックだが、 a) 手と物体の相互作用を映した実環境の生の映像データと b) 合成3次元形状コレクションは豊富に存在する。本稿では,これらのソースから3Dの教師信号を活用するモジュールを提案し,手持ち物体を再構成するモデルの学習をスケールアップする。具体的には、ビデオから多視点2Dマスクの教師信号を抽出し、形状コレクションから3次元形状の事前分布を抽出する。我々はこれらの間接的な3次元キューを用いて、単一のRGB画像から物体の3次元形状を予測する占有ネットワークを訓練する。実環境のMOWデータセットにおける困難な物体汎化設定での実験では、既存のデータセットで3Dの教師ありで訓練したモデルに対して11.6%の相対的な改善が見られた。 Prior works for reconstructing hand-held objects from a single image train models on images paired with 3D shapes. Such data is challenging to gather in the real world at scale. Consequently, these approaches do not generalize well when presented with novel objects in in-the-wild settings. While 3D supervision is a major bottleneck, there is an abundance of a) in-the-wild raw video data showing hand-object interactions and b) synthetic 3D shape collections. In this paper, we propose modules to leverage 3D supervision from these sources to scale up the learning of models for reconstructing hand-held objects. Specifically, we extract multiview 2D mask supervision from videos and 3D shape priors from shape collections. We use these indirect 3D cues to train occupancy networks that predict the 3D shape of objects from a single RGB image. Our experiments in the challenging object generalization setting on in-the-wild MOW dataset show 11.6% relative improvement over models trained with 3D supervision on existing datasets.	公開日:2024-09-23 翻訳日:2024-11-09 15:13:22
# 大規模言語モデルのための高速分散推論サービング Fast Distributed Inference Serving for Large Language Models ( http://arxiv.org/abs/2305.05920v2 ) ライセンス: Link先を確認	Bingyang Wu, Yinmin Zhong, Zili Zhang, Gang Huang, Xuanzhe Liu, Xin Jin,	(参考訳) 大規模言語モデル(LLM)は、ChatGPTに代表される新世代の対話型AIアプリケーションを支えている。これらのアプリケーションのインタラクティブな性質は、LLM推論に低レイテンシを必要とする。既存のLLMサービングシステムは、推論ジョブに対してrun-to-completion(完了まで実行する)処理を使用しており、ヘッドオブラインブロッキングと長いレイテンシに悩まされる。 LLMのための分散推論サービングシステムであるFastServeについて述べる。 FastServeはLLM推論の自己回帰パターンを利用して、各出力トークンの粒度でのプリエンプションを可能にする。 FastServeはプリエンプティブスケジューリングを使用して、新しいskip-joinマルチレベルフィードバックキュースケジューラでレイテンシを最小限にする。 LLM推論の新たな半情報非依存設定に基づいて、スケジューラは入力長情報を利用して、到着する各ジョブに適切な初期キューを割り当てる。参加したキューよりも優先度の高いキューは、降格(demotion)を減らすためにスキップされる。我々は、LLM推論のための中間状態をGPUメモリとホストメモリの間で積極的にオフロードおよびアップロードする効率的なGPUメモリ管理機構を設計する。我々はFastServeのシステムプロトタイプを構築し,実験の結果,最先端のソリューションであるvLLMと比較して,同じ平均レイテンシ要件および同じテールレイテンシ要件の下で,スループットをそれぞれ最大31.4倍および17.9倍に改善することを示す。 Large language models (LLMs) power a new generation of interactive AI applications exemplified by ChatGPT. The interactive nature of these applications demands low latency for LLM inference. Existing LLM serving systems use run-to-completion processing for inference jobs, which suffers from head-of-line blocking and long latency. We present FastServe, a distributed inference serving system for LLMs. FastServe exploits the autoregressive pattern of LLM inference to enable preemption at the granularity of each output token. FastServe uses preemptive scheduling to minimize latency with a novel skip-join Multi-Level Feedback Queue scheduler. Based on the new semi-information-agnostic setting of LLM inference, the scheduler leverages the input length information to assign an appropriate initial queue for each arrival job to join. The higher priority queues than the joined queue are skipped to reduce demotions. We design an efficient GPU memory management mechanism that proactively offloads and uploads intermediate state between GPU memory and host memory for LLM inference. We build a system prototype of FastServe and experimental results show that compared to the state-of-the-art solution vLLM, FastServe improves the throughput by up to 31.4x and 17.9x under the same average and tail latency requirements, respectively.	公開日:2024-09-25 翻訳日:2024-11-09 15:13:22
# 大規模言語モデルのための高速分散推論サービング Fast Distributed Inference Serving for Large Language Models ( http://arxiv.org/abs/2305.05920v3 ) ライセンス: Link先を確認	Bingyang Wu, Yinmin Zhong, Zili Zhang, Shengyu Liu, Fangyue Liu, Yuanhang Sun, Gang Huang, Xuanzhe Liu, Xin Jin,	(参考訳) 大規模言語モデル(LLM)は、ChatGPTに代表される新世代の対話型AIアプリケーションを支えている。これらのアプリケーションのインタラクティブな性質は、LLM推論に低レイテンシを必要とする。既存のLLMサービングシステムは、推論ジョブに対してrun-to-completion(完了まで実行する)処理を使用しており、ヘッドオブラインブロッキングと長いレイテンシに悩まされる。 LLMのための分散推論サービングシステムであるFastServeについて述べる。 FastServeはLLM推論の自己回帰パターンを利用して、各出力トークンの粒度でのプリエンプションを可能にする。 FastServeはプリエンプティブスケジューリングを使用して、新しいskip-joinマルチレベルフィードバックキュースケジューラでレイテンシを最小限にする。 LLM推論の新たな半情報非依存設定に基づいて、スケジューラは入力長情報を利用して、到着する各ジョブに適切な初期キューを割り当てる。参加したキューよりも優先度の高いキューは、降格(demotion)を減らすためにスキップされる。我々は、LLM推論のための中間状態をGPUメモリとホストメモリの間で積極的にオフロードおよびアップロードする効率的なGPUメモリ管理機構を設計する。我々はFastServeのシステムプロトタイプを構築し,実験の結果,最先端のソリューションであるvLLMと比較して,同じ平均レイテンシ要件および同じテールレイテンシ要件の下で,スループットをそれぞれ最大31.4倍および17.9倍に改善することを示す。 Large language models (LLMs) power a new generation of interactive AI applications exemplified by ChatGPT. The interactive nature of these applications demands low latency for LLM inference. Existing LLM serving systems use run-to-completion processing for inference jobs, which suffers from head-of-line blocking and long latency. We present FastServe, a distributed inference serving system for LLMs. FastServe exploits the autoregressive pattern of LLM inference to enable preemption at the granularity of each output token. FastServe uses preemptive scheduling to minimize latency with a novel skip-join Multi-Level Feedback Queue scheduler. Based on the new semi-information-agnostic setting of LLM inference, the scheduler leverages the input length information to assign an appropriate initial queue for each arrival job to join. The higher priority queues than the joined queue are skipped to reduce demotions. We design an efficient GPU memory management mechanism that proactively offloads and uploads intermediate state between GPU memory and host memory for LLM inference. We build a system prototype of FastServe and experimental results show that compared to the state-of-the-art solution vLLM, FastServe improves the throughput by up to 31.4x and 17.9x under the same average and tail latency requirements, respectively.	公開日:2024-09-25 翻訳日:2024-11-09 15:13:22
# CADGE: グラフ構造化知識集約による文脈認識対話生成 CADGE: Context-Aware Dialogue Generation Enhanced with Graph-Structured Knowledge Aggregation ( http://arxiv.org/abs/2305.06294v4 ) ライセンス: Link先を確認	Hongbo Zhang, Chen Tang, Tyler Loakman, Bohao Yang, Stefan Goetze, Chenghua Lin,	(参考訳) 常識知識は多くの自然言語処理タスクに不可欠である。既存の研究は通常、グラフ知識を従来のグラフニューラルネットワーク(GNN)に組み込むため、テキスト知識とグラフ知識の符号化過程が区画化された逐次パイプラインになる。しかし、この区画化は、これらの2種類の入力知識間の文脈的相互作用を完全に活用するわけではない。本稿では,文脈強化された知識集約機構により関連する知識グラフからグローバルな特徴を効果的に取り込む,新しい文脈対応グラフアテンションモデル (Context-aware GAT) を提案する。具体的には、フラット化したグラフ知識とテキストデータとを融合させることにより、不均一な特徴を調和させる表現学習に革新的なアプローチを採用する。コンテクスト情報によって補完される連結部分グラフにおけるグラフ知識集約の階層的適用が、コモンセンス駆動対話の生成をどのように促進するかを分析する。実験により,本フレームワークは従来のGNNベース言語モデルよりも性能が優れていることが示された。自動評価と人的評価の両面から,提案モデルのコンセプトフロー(concept flow)ベースラインに対する大幅な性能向上が確認できた。 Commonsense knowledge is crucial to many natural language processing tasks. Existing works usually incorporate graph knowledge with conventional graph neural networks (GNNs), resulting in a sequential pipeline that compartmentalizes the encoding processes for textual and graph-based knowledge. This compartmentalization, however, does not fully exploit the contextual interplay between these two types of input knowledge. In this paper, a novel context-aware graph-attention model (Context-aware GAT) is proposed, designed to effectively assimilate global features from relevant knowledge graphs through a context-enhanced knowledge aggregation mechanism. Specifically, the proposed framework employs an innovative approach to representation learning that harmonizes heterogeneous features by amalgamating flattened graph knowledge with text data. The hierarchical application of graph knowledge aggregation within connected subgraphs, complemented by contextual information, to bolster the generation of commonsense-driven dialogues is analyzed. Empirical results demonstrate that our framework outperforms conventional GNN-based language models in terms of performance. Both automated and human evaluations affirm the significant performance enhancements achieved by our proposed model over the concept flow baseline.	公開日:2024-09-22 翻訳日:2024-11-09 15:13:22
# 脳腫瘍セグメンテーション(BraTS)課題 : 塗布による健康な脳組織の局所的合成 The Brain Tumor Segmentation (BraTS) Challenge: Local Synthesis of Healthy Brain Tissue via Inpainting ( http://arxiv.org/abs/2305.08992v3 ) ライセンス: Link先を確認	Florian Kofler, Felix Meissen, Felix Steinbauer, Robert Graf, Stefan K Ehrlich, Annika Reinke, Eva Oswald, Diana Waldmannstetter, Florian Hoelzl, Izabela Horvath, Oezguen Turgut, Suprosanna Shit, Christina Bukas, Kaiyuan Yang, Johannes C. Paetzold, Ezequiel de da Rosa, Isra Mekki, Shankeeth Vinayahalingam, Hasan Kassem, Juexin Zhang, Ke Chen, Ying Weng, Alicia Durrer, Philippe C. Cattin, Julia Wolleb, M. S. Sadique, M. M. Rahman, W. Farzana, A. Temtam, K. M. Iftekharuddin, Maruf Adewole, Syed Muhammad Anwar, Ujjwal Baid, Anastasia Janas, Anahita Fathi Kazerooni, Dominic LaBella, Hongwei Bran Li, Ahmed W Moawad, Gian-Marco Conte, Keyvan Farahani, James Eddy, Micah Sheller, Sarthak Pati, Alexandros Karagyris, Alejandro Aristizabal, Timothy Bergquist, Verena Chung, Russell Takeshi Shinohara, Farouk Dako, Walter Wiggins, Zachary Reitman, Chunhao Wang, Xinyang Liu, Zhifan Jiang, Elaine Johanson, Zeke Meier, Ariana Familiar, Christos Davatzikos, John Freymann, Justin Kirby, Michel Bilello, Hassan M Fathallah-Shaykh, Roland Wiest, Jan Kirschke, Rivka R Colen, Aikaterini Kotrotsou, Pamela Lamontagne, Daniel Marcus, Mikhail Milchenko, Arash Nazeri, Marc-André Weber, Abhishek Mahajan, Suyash Mohan, John Mongan, Christopher Hess, Soonmee Cha, Javier Villanueva-Meyer, Errol Colak, Priscila Crivellaro, Andras Jakab, Abiodun Fatade, Olubukola Omidiji, Rachel Akinola Lagos, O O Olatunji, Goldey Khanna, John Kirkpatrick, Michelle Alonso-Basanta, Arif Rashid, Miriam Bornhorst, Ali Nabavizadeh, Natasha Lepore, Joshua Palmer, Antonio Porras, Jake Albrecht, Udunna Anazodo, Mariam Aboian, Evan Calabrese, Jeffrey David Rudie, Marius George Linguraru, Juan Eugenio Iglesias, Koen Van Leemput, Spyridon Bakas, Benedikt Wiestler, Ivan Ezhov, Marie Piraud, Bjoern H Menze,	(参考訳) 
脳MR画像の自動解析のための無数のアルゴリズムが、臨床医の意思決定を支援するために利用可能である。脳腫瘍患者の場合、画像取得の時系列は通常、すでに病理的なスキャンから始まる。多くのアルゴリズムは健康な脳を解析し、病変を特徴とする画像の保証を提供しない。例えば、脳解剖学のパーセレーション、組織セグメンテーション、脳抽出のアルゴリズムがある。このジレンマを解決するために,BraTS塗装の課題を紹介する。そこで参加者は、損傷した脳から健康な脳スキャンを合成するための塗装技術を探る。下記の原稿にはタスクの定式化、データセット、提出手順が含まれている。その後、課題の調査結果をまとめるために更新される。この挑戦はASNR-BraTS MICCAIチャレンジの一部として組織されている。 A myriad of algorithms for the automatic analysis of brain MR images is available to support clinicians in their decision-making. For brain tumor patients, the image acquisition time series typically starts with an already pathological scan. This poses problems, as many algorithms are designed to analyze healthy brains and provide no guarantee for images featuring lesions. Examples include, but are not limited to, algorithms for brain anatomy parcellation, tissue segmentation, and brain extraction. To solve this dilemma, we introduce the BraTS inpainting challenge. Here, the participants explore inpainting techniques to synthesize healthy brain scans from lesioned ones. The following manuscript contains the task formulation, dataset, and submission procedure. Later, it will be updated to summarize the findings of the challenge. The challenge is organized as part of the ASNR-BraTS MICCAI challenge.	公開日:2024-09-22 翻訳日:2024-11-09 15:13:22
# 教師なし要約の最近の動向 Recent Trends in Unsupervised Summarization ( http://arxiv.org/abs/2305.11231v2 ) ライセンス: Link先を確認	Mohammad Khosravani, Amine Trabelsi,	(参考訳) 教師なしの要約は、ラベル付きデータセットを必要とせずにモデルを要約する訓練を可能にする強力なテクニックである。このサーベイは、教師なし要約に使用される様々な手法とモデルをカバーしている。我々は、教師なし要約を実現するために用いられる抽出的、抽象的、ハイブリッドなモデルと戦略を網羅する。この調査の主な焦点は最近の研究であるが、過去の重要な研究についても紹介する。さらに分類学を導入し、教師なしトレーニングへのアプローチに基づいて異なる研究を分類する。最後に、現在のアプローチについて議論し、いくつかのデータセットと評価手法について述べる。 Unsupervised summarization is a powerful technique that enables training summarizing models without requiring labeled datasets. This survey covers different recent techniques and models used for unsupervised summarization. We cover extractive, abstractive, and hybrid models and strategies used to achieve unsupervised summarization. While the main focus of this survey is on recent research, we also cover some of the important previous research. We additionally introduce a taxonomy, classifying different research based on their approach to unsupervised training. Finally, we discuss the current approaches and mention some datasets and evaluation methods.	公開日:2024-09-26 翻訳日:2024-11-09 15:13:22
# 言語モデルに追従する: バイアス監査のためのシステムベンチマーク拡張 Keeping Up with the Language Models: Systematic Benchmark Extension for Bias Auditing ( http://arxiv.org/abs/2305.12620v2 ) ライセンス: Link先を確認	Ioana Baldini, Chhavi Yadav, Manish Nagireddy, Payel Das, Kush R. Varshney,	(参考訳) 言語モデル (LM) のバイアス監査は, LM が普及するにつれて注目されている。このように、バイアス監査のためのいくつかのベンチマークが提案されている。同時に、LMの急速な進化は、これらのベンチマークをすぐに無関係にすることができる。バイアス監査は、LMの脆性によってさらに複雑である: おそらくバイアスのある結果が観察された場合、それはモデルバイアスかモデル脆性によるものか? モデル自体を登録して、困難なままのバイアス監査データセットの構築を支援し、異なるタイプのモデルエラーを区別するバイアス測定を導入することを提案する。まず,NLI(BBNLI)の既存のバイアスベンチマークを,LM生成語彙の変動,逆フィルタリング,人間による検証の組み合わせを用いて拡張する。 BBNLI-nextは平均して最先端のNLIモデルの精度を95.3%から57.5%に下げる。次に、BBNLI-nextを用いて、ロバスト性とバイアスの相互作用を示す。現在のバイアススコアの欠点を指摘し、バイアスとモデルの脆さを考慮に入れたバイアス対策を提案する。第三に、BBNLI-nextは非生成モデルを念頭に設計されているにもかかわらず、新しいデータセットは、最先端のオープンソース生成LMのバイアスを明らかにすることが可能であることを示す。注: この研究に含まれるすべてのデータセットは英語で書かれており、米国中心の社会的偏見に対処している。効率的なNLP研究の精神において、この研究を行うためのモデルトレーニングや微調整は行われなかった。警告: 攻撃的なテキスト例を含む。 Bias auditing of language models (LMs) has received considerable attention as LMs are becoming widespread. As such, several benchmarks for bias auditing have been proposed. At the same time, the rapid evolution of LMs can make these benchmarks irrelevant in no time. Bias auditing is further complicated by LM brittleness: when a presumably biased outcome is observed, is it due to model bias or model brittleness? We propose enlisting the models themselves to help construct bias auditing datasets that remain challenging, and introduce bias measures that distinguish between different types of model errors. First, we extend an existing bias benchmark for NLI (BBNLI) using a combination of LM-generated lexical variations, adversarial filtering, and human validation. We demonstrate that the newly created dataset BBNLI-next is more challenging than BBNLI: on average, BBNLI-next reduces the accuracy of state-of-the-art NLI models from 95.3%, as observed by BBNLI, to a strikingly low 57.5%. 
Second, we employ BBNLI-next to showcase the interplay between robustness and bias: we point out shortcomings in current bias scores and propose bias measures that take into account both bias and model brittleness. Third, despite the fact that BBNLI-next was designed with non-generative models in mind, we show that the new dataset is also able to uncover bias in state-of-the-art open-source generative LMs. Note: All datasets included in this work are in English and they address US-centered social biases. In the spirit of efficient NLP research, no model training or fine-tuning was performed to conduct this research. Warning: This paper contains offensive text examples.	公開日:2024-09-25 翻訳日:2024-11-09 15:13:22
# GUARD: 安全な強化学習ベンチマーク GUARD: A Safe Reinforcement Learning Benchmark ( http://arxiv.org/abs/2305.13681v4 ) ライセンス: Link先を確認	Weiye Zhao, Yifan Sun, Feihan Li, Rui Chen, Ruixuan Liu, Tianhao Wei, Changliu Liu,	(参考訳) 試行錯誤の性質のため、そのようなエラーが許容できない自律運転、人間とロボットのインタラクション、ロボット操作など、安全クリティカルな現実世界のアプリケーションにRLアルゴリズムを適用することは、一般的に困難である。近年、安全なRL(すなわち制約付きRL)は、制約を満たすとともに、エージェントが環境を探索する文献に急速に現れている。アルゴリズムとタスクの多様性のため、既存の安全なRLアルゴリズムを比較するのは難しい。このギャップを埋めるために、一般化されたSAfe強化学習ベンチマークであるGUARDを紹介します。 GUARDは既存のベンチマークと比べていくつかの利点がある。まず、GUARDは様々なRLエージェント、タスク、安全制約仕様を備えた一般化されたベンチマークである。第2に、GUARDは自己完結した実装で最先端の安全なRLアルゴリズムを包括的にカバーしている。第3に、GUARDはタスクやアルゴリズムで高度にカスタマイズできる。本稿では,GUARDを用いた各種タスク設定における最先端安全RLアルゴリズムの比較を行い,今後の作業が構築できるベースラインを確立する。 Due to the trial-and-error nature, it is typically challenging to apply RL algorithms to safety-critical real-world applications, such as autonomous driving, human-robot interaction, robot manipulation, etc, where such errors are not tolerable. Recently, safe RL (i.e. constrained RL) has emerged rapidly in the literature, in which the agents explore the environment while satisfying constraints. Due to the diversity of algorithms and tasks, it remains difficult to compare existing safe RL algorithms. To fill that gap, we introduce GUARD, a Generalized Unified SAfe Reinforcement Learning Development Benchmark. GUARD has several advantages compared to existing benchmarks. First, GUARD is a generalized benchmark with a wide variety of RL agents, tasks, and safety constraint specifications. Second, GUARD comprehensively covers state-of-the-art safe RL algorithms with self-contained implementations. Third, GUARD is highly customizable in tasks and algorithms. We present a comparison of state-of-the-art safe RL algorithms in various task settings using GUARD and establish baselines that future work can build on.	公開日:2024-09-24 翻訳日:2024-11-09 15:02:22
# フォノンによる巨大物体を持つ空間量子重ね合わせの極限 Limit on spatial quantum superpositions with massive objects due to phonons ( http://arxiv.org/abs/2305.15230v2 ) ライセンス: Link先を確認	Carsten Henkel, Ron Folman,	(参考訳) 巨大な物体を実空間の異なる位置の重ね合わせに持ち込むことは長年の目標であり、新しい状態における量子理論を確かめるだけでなく、重力との界面を探索することでもある。主な課題は通常、大きな物体の波動関数を統計的混合に分解する環境場や粒子による力や散乱によって生じると考えられている。環境からの隔離の改善によって除去できないデコヒーレンスチャネルを公表する。これは物体内の音波から派生したもので、任意の分裂過程の一部として励起され、部分的な「ヴェルチャー・ウェッグ」情報を運ぶ。これにより、大きな物体の将来の空間重ね合わせに厳密な制約が課される。 It has been a long-standing goal to bring massive objects into a superposition of different locations in real space, not only to confirm quantum theory in new regimes, but also to explore the interface with gravity. The main challenge is usually thought to arise from forces or scattering due to environmental fields and particles that decohere the large object's wave function into a statistical mixture. We unveil a decoherence channel which cannot be eliminated by improved isolation from the environment. It originates from sound waves within the object, which are excited as part of any splitting process and carry partial "Welcher Weg" information. This puts stringent constraints on future spatial superpositions of large objects.	公開日:2024-09-27 翻訳日:2024-11-09 15:02:22
# Taylorformer: 時系列を含むランダムプロセスの確率論的モデリング Taylorformer: Probabilistic Modelling for Random Processes including Time Series ( http://arxiv.org/abs/2305.19141v2 ) ライセンス: Link先を確認	Omer Nivron, Raghul Parthipan, Damon J. Wischik,	(参考訳) 時系列などのランダムなプロセスに対してTaylorformerを提案する。その2つの重要な構成要素は以下のとおりである。 1) ニューラルネットワークに基づく確率モデルで使えるようにTaylor近似(力学系で使用される)を適応させるLocalTaylorラッパー 2) ガウス過程の平均予測が文脈データの線形平滑化であることに着想を得た方法で予測を行うMHA-Xアテンションブロック。 Taylorformerは、1次元関数のメタ学習のような古典的なNeural Processタスク6つのうち5つで、対数尤度の点で最先端手法を上回り、電気、油温、為替レートなどの予測タスクでは、少なくとも14%のMSE改善を達成している。 Taylorformerは、一貫した確率過程を近似し、不確実性を考慮した予測を提供する。私たちのコードは補足材料で提供されます。 We propose the Taylorformer for random processes such as time series. Its two key components are: 1) the LocalTaylor wrapper which adapts Taylor approximations (used in dynamical systems) for use in neural network-based probabilistic models, and 2) the MHA-X attention block which makes predictions in a way inspired by how Gaussian Processes' mean predictions are linear smoothings of contextual data. Taylorformer outperforms the state-of-the-art in terms of log-likelihood on 5/6 classic Neural Process tasks such as meta-learning 1D functions, and has at least a 14% MSE improvement on forecasting tasks, including electricity, oil temperatures and exchange rates. Taylorformer approximates a consistent stochastic process and provides uncertainty-aware predictions. Our code is provided in the supplementary material.	公開日:2024-09-23 翻訳日:2024-11-09 15:02:22
# プライバシ保護を備えた説明可能な認証:ユニバーサルログインのためのLarchシステム Accountable authentication with privacy protection: The Larch system for universal login ( http://arxiv.org/abs/2305.19241v8 ) ライセンス: Link先を確認	Emma Dauterman, Danny Lin, Henry Corrigan-Gibbs, David Mazières,	(参考訳) クレデンシャルの侵害は検出も緩和も難しい。この問題に対処するために,強力なセキュリティとプライバシ特性を備えた説明可能な認証フレームワークであるlarchを提案する。 Larchはユーザのプライバシを保護しつつ、larchログサーバがすべての認証を正しく記録することを保証する。具体的には、ユーザのデバイスを侵害した攻撃者は、ログに証拠を残さずに認証することができず、ログは、ユーザがどのWebサービス(リライングパーティ)に対して認証しているかを知ることができない。迅速な採用を実現するため、larchはFIDO2、TOTP、パスワードベースのログインをサポートするリライングパーティと後方互換性がある。さらに、larchは、ユーザがすでに期待しているセキュリティとプライバシを劣化させない。ログサーバはユーザに代わって認証することができず、larchはリライングパーティがアカウントをまたいでユーザを関連付けることを許さない。 FIDO2、TOTP、パスワードベースのログインのためのlarchを実装している。 4コアのクライアントと8コアのログサーバを用いた場合、larchによる認証はFIDO2で150ms、TOTPで91ms、パスワードで74msである(TOTPで1.23sかかる前処理を除く)。 Credential compromise is hard to detect and hard to mitigate. To address this problem, we present larch, an accountable authentication framework with strong security and privacy properties. Larch protects user privacy while ensuring that the larch log server correctly records every authentication. Specifically, an attacker who compromises a user's device cannot authenticate without creating evidence in the log, and the log cannot learn which web service (relying party) the user is authenticating to. To enable fast adoption, larch is backwards-compatible with relying parties that support FIDO2, TOTP, and password-based login. Furthermore, larch does not degrade the security and privacy a user already expects: the log server cannot authenticate on behalf of a user, and larch does not allow relying parties to link a user across accounts. We implement larch for FIDO2, TOTP, and password-based login. Given a client with four cores and a log server with eight cores, an authentication with larch takes 150ms for FIDO2, 91ms for TOTP, and 74ms for passwords (excluding preprocessing, which takes 1.23s for TOTP).	公開日:2024-09-23 翻訳日:2024-11-09 15:02:22
# エンタングルメントのコンパスとしてのグリュナイゼンパラメータとヘルマン・ファインマンの定理の破綻 Grüneisen parameter as an entanglement compass and the breakdown of the Hellmann-Feynman theorem ( http://arxiv.org/abs/2306.00566v2 ) ライセンス: Link先を確認	Lucas Squillante, Luciano S. Ricco, Aniekan Magnus Ukpong, Roberto E. Lagos-Monaco, Antonio C. Seridonio, Mariano de Souza,	(参考訳) Grüneisen比 $\Gamma$, すなわち、熱膨張と比熱の比の特異部分は、有限$T$の臨界点と量子臨界点(QCP)の両方を探索するために広く用いられている。真の量子相転移(QPT)では、熱ゆらぎが存在せず、熱力学的な$\Gamma$は使用できない。チューニングパラメータ $\lambda$ の関数としてエンタングルメントを計算する$\Gamma$ の量子アナログを提案し、基底状態エネルギーが$\lambda$ に非線形に依存するシステムに対してのみ QPT が起こることを示す。さらに、任意のQCPにおける熱力学極限でのヘルマン・ファインマンの定理の破綻を実証する。本稿では,横磁場をもつ量子1次元イジングモデルとKaneの量子コンピュータを用いて我々のアプローチを紹介する。ダイナミクスの減速と、それに伴うQCP/QPT近傍での「質量の創出」についても論じる。 The Grüneisen ratio $\Gamma$, i.e., the singular part of the ratio of thermal expansion to the specific heat, has been broadly employed to explore both finite-$T$ and quantum critical points (QCPs). For a genuine quantum phase transition (QPT), thermal fluctuations are absent and thus the thermodynamic $\Gamma$ cannot be employed. We propose a quantum analogue to $\Gamma$ that computes entanglement as a function of a tuning parameter $\lambda$ and show that QPTs take place only for systems in which the ground-state energy depends on $\lambda$ non-linearly. Furthermore, we demonstrate the breakdown of the Hellmann-Feynman theorem in the thermodynamic limit at any QCP. We showcase our approach using the quantum 1D Ising model with transverse field and Kane's quantum computer. The slowing down of the dynamics and thus the "creation of mass" close to any QCP/QPT is also discussed.	公開日:2024-09-25 翻訳日:2024-11-09 15:02:22
# 弦理論における有限エンタングルメントエントロピー Finite Entanglement Entropy in String Theory ( http://arxiv.org/abs/2306.00990v2 ) ライセンス: Link先を確認	Atish Dabholkar, Upamanyu Moitra,	(参考訳) 我々は、10次元のタイプII弦理論における1ループの量子エンタングルメントエントロピーを、任意の奇数の整数$N > 1$で知られている$\mathbb{R}^2/\mathbb{Z}_N$の弦オービフォールドに対する属1分割関数を解析的に$N$で連続させることにより解析する。オービフォールド分割関数に対するタキオン寄与は、物理的領域 $0 < N \leq 1$ において有限である式に適切にまとめ、解析的に連続し、エンタングルメントエントロピーに対する有限で計算可能な解が得られることを示す。情報パラドックス,量子重力,ホログラフィーにおけるエンタングルメントエントロピーの有限性の影響について論じる。 We analyze the one-loop quantum entanglement entropy in ten-dimensional Type-II string theory using the orbifold method by analytically continuing in $N$ the genus-one partition function for string orbifolds on $\mathbb{R}^2/\mathbb{Z}_N$ conical spaces known for all odd integers $N > 1$. We show that the tachyonic contributions to the orbifold partition function can be appropriately summed and analytically continued to an expression that is finite in the physical region $0 < N \leq 1$ resulting in a finite and calculable answer for the entanglement entropy. We discuss the implications of the finiteness of the entanglement entropy for the information paradox, quantum gravity, and holography.	公開日:2024-09-24 翻訳日:2024-11-09 15:02:22
# SelFLoc: 大規模点群に基づく位置認識のための選択的特徴融合 SelFLoc: Selective Feature Fusion for Large-scale Point Cloud-based Place Recognition ( http://arxiv.org/abs/2306.01205v3 ) ライセンス: Link先を確認	Qibo Qiu, Wenxiao Wang, Haochao Ying, Dingkun Liang, Haiming Gao, Xiaofei He,	(参考訳) ポイントクラウドベースの位置認識は、特にグローバルな位置センサがアクセスできない場合、モバイルロボットや自動運転車にとって不可欠である。LiDARの点は物体や建物の表面に散在しており、それらは異なる軸に沿って強い形状の事前知識(prior)を持つ。特定の軸に沿ったメッセージパッシングを改善するために,本論文の主なコントリビューションのひとつとして,スタック型非対称畳み込みブロック(SACB)が設計されている。総合的な実験により、SACBが採用した非対称な畳み込みとその戦略が、ポイントクラウドの特徴のより効果的な表現に寄与できることが示されている。これに基づき,点単位およびチャネル単位のゲーティング層を所定の順序で積み重ねて構成されるSFFB (Selective Feature Fusion Block) を提案し,特定の鍵領域の顕著な局所的特徴を選択的に増強するとともに,融合前の特徴を整列させる。 SACBとSFFBは、SelFLocと呼ばれるポイントクラウドベースの位置認識のための堅牢で正確なアーキテクチャを構築するために結合される。比較実験の結果,SelFLocはOxfordおよび他の3つのインハウスベンチマークにおいて,平均リコール@1で絶対値1.6ポイントの改善を伴う最先端(SOTA)性能を達成した。 Point cloud-based place recognition is crucial for mobile robots and autonomous vehicles, especially when the global positioning sensor is not accessible. LiDAR points are scattered on the surface of objects and buildings, which have strong shape priors along different axes. To enhance message passing along particular axes, Stacked Asymmetric Convolution Block (SACB) is designed, which is one of the main contributions in this paper. Comprehensive experiments demonstrate that asymmetric convolution and its corresponding strategies employed by SACB can contribute to the more effective representation of point cloud features. On this basis, Selective Feature Fusion Block (SFFB), which is formed by stacking point- and channel-wise gating layers in a predefined sequence, is proposed to selectively boost salient local features in certain key regions, as well as to align the features before the fusion phase. SACBs and SFFBs are combined to construct a robust and accurate architecture for point cloud-based place recognition, which is termed SelFLoc. Comparative experimental results show that SelFLoc achieves the state-of-the-art (SOTA) performance on the Oxford and three other in-house benchmarks with an improvement of 1.6 absolute percentage points on mean average recall@1.	公開日:2024-09-24 翻訳日:2024-11-09 15:02:22
# 漸近テンソルランクの離散性 Discreteness of asymptotic tensor ranks ( http://arxiv.org/abs/2306.01718v3 ) ライセンス: Link先を確認	Jop Briët, Matthias Christandl, Itai Leigh, Amir Shpilka, Jeroen Zuiddam,	(参考訳) テンソルパラメータは、しばしば「漸近的」テンソルパラメータと呼ばれ、代数的複雑性理論(高速行列乗算アルゴリズムの構築)、量子情報(絡み合いコストと蒸留可能な絡み合い)、加法的コンビネータ(キャップセット、サンフラワーフリーセットなど)を含むいくつかの領域において中心的な役割を果たす。例えば、漸近テンソルランク、漸近スライスランク、漸近サブランクである。最近の研究 (Costa-Dalai, Blatter-Draisma-Rupniewski, Christandl-Gesmundo-Zuiddam) では、そのようなテンソルパラメータの値における離散性(累積点を持たない)や「ギャップ」の概念が研究されている。我々は、次数3テンソルの漸近テンソルパラメータに対する一般的な離散性定理を証明し、これを、(1)任意の有限体(実際、任意の体における係数の有限集合)、漸近部分ランクおよび漸近スライスランクが累積点を持たないこと、(2)複素数上では、漸近スライスランクが累積点を持たないことを証明するために利用する。我々のアプローチの中心はテンソルの漸近部分ランクの2つの新しい一般下界であり、テンソルがどれだけ対角化できるかを測定する。最初の下界は、任意の簡潔な3次元テンソルの漸近部分ランクは、少なくとも最小次元の立方根であると述べている。 2番目の下界は、「十分狭く」(他の2つよりも1次元がかなり小さい)任意の簡潔な3つのテンソルは、最大漸近部分ランクを持つと述べている。我々の証明は、行列部分空間の最大階数に対する新しい下界に依存し、3つの異なる方向に3つのテンソルをスライスすることで得られる。任意の簡潔なテンソルに対して、そのような最大ランクの任意の2つの積は大きいものでなければならないことを証明し、その結果、常に大きな最大ランクを持つ2つの異なる方向が存在する。 Tensor parameters that are amortized or regularized over large tensor powers, often called "asymptotic" tensor parameters, play a central role in several areas including algebraic complexity theory (constructing fast matrix multiplication algorithms), quantum information (entanglement cost and distillable entanglement), and additive combinatorics (bounds on cap sets, sunflower-free sets, etc.). Examples are the asymptotic tensor rank, asymptotic slice rank and asymptotic subrank. Recent works (Costa-Dalai, Blatter-Draisma-Rupniewski, Christandl-Gesmundo-Zuiddam) have investigated notions of discreteness (no accumulation points) or "gaps" in the values of such tensor parameters. 
We prove a general discreteness theorem for asymptotic tensor parameters of order-three tensors and use this to prove that (1) over any finite field (and in fact any finite set of coefficients in any field), the asymptotic subrank and the asymptotic slice rank have no accumulation points, and (2) over the complex numbers, the asymptotic slice rank has no accumulation points. Central to our approach are two new general lower bounds on the asymptotic subrank of tensors, which measures how much a tensor can be diagonalized. The first lower bound says that the asymptotic subrank of any concise three-tensor is at least the cube-root of the smallest dimension. The second lower bound says that any concise three-tensor that is "narrow enough" (has one dimension much smaller than the other two) has maximal asymptotic subrank. Our proofs rely on new lower bounds on the maximum rank in matrix subspaces that are obtained by slicing a three-tensor in the three different directions. We prove that for any concise tensor, the product of any two such maximum ranks must be large, and as a consequence there are always two distinct directions with large max-rank.	公開日:2024-09-24 翻訳日:2024-11-09 15:02:22
# 線形文脈による探索のインセンティブと組合せ行動 Incentivizing Exploration with Linear Contexts and Combinatorial Actions ( http://arxiv.org/abs/2306.01990v3 ) ライセンス: Link先を確認	Mark Sellke,	(参考訳) 我々は、腕の選択を推奨とみなし、ベイズ的インセンティブと互換性を持たなければならない、インセンティブ付きバンディット探索の研究を前進させる。最近の研究は、十分な初期サンプルを収集した後、人気のあるトンプソンサンプリングアルゴリズムがインセンティブ互換になる、という一定の独立性の仮定の下で示されている。線形包帯に対してこの結果の類似性を与え、そこでは前者の独立性を自然凸条件に置き換える。これにより、高次元の行動空間における効率的かつ後悔に満ちたインセンティブ付き探索の可能性が開ける。半帯域モデルでは、初期データ収集のトンプソン前サンプリングフェーズにおけるサンプルの複雑さも改善する。 We advance the study of incentivized bandit exploration, in which arm choices are viewed as recommendations and are required to be Bayesian incentive compatible. Recent work has shown under certain independence assumptions that after collecting enough initial samples, the popular Thompson sampling algorithm becomes incentive compatible. We give an analog of this result for linear bandits, where the independence of the prior is replaced by a natural convexity condition. This opens up the possibility of efficient and regret-optimal incentivized exploration in high-dimensional action spaces. In the semibandit model, we also improve the sample complexity for the pre-Thompson sampling phase of initial data collection.	公開日:2024-09-24 翻訳日:2024-11-09 15:02:22
# 合成能動推論エージェントの実現その2: 変分メッセージ更新 Realising Synthetic Active Inference Agents, Part II: Variational Message Updates ( http://arxiv.org/abs/2306.02733v3 ) ライセンス: Link先を確認	Thijs van de Laar, Magnus Koudahl, Bert de Vries,	(参考訳) 自由エネルギー原理(FEP)は、(生物学的)エージェントを、環境の生成モデルに関する変分自由エネルギー(FE)を最小化するものとして記述している。アクティブ推論(英: Active Inference、AIF)は、エージェントが期待FE目標を最小化することによって環境を探索し、活用する方法を記述するFEPの系である。 2つの関連論文において、自由形式のForney-style Factor Graphs (FFGs) 上のメッセージパッシングによる、合成AIFへのスケーラブルでエピステミックなアプローチについて述べる。姉妹論文(第1部)は、AIFの(一般化)FE目標を視覚的に表現する制約付きFFG(CFFG)表記法を導入する。現在の論文(パートII)は、変分法によりCFFG上の(一般化)FE目的を最小化するメッセージパッシングアルゴリズムを導出する。シミュレーションされたBetheエージェントと一般化FEエージェントの比較は、合成AIFへのメッセージパッシングアプローチがT迷路ナビゲーションタスクにおいてどのようにエピステミックな行動を引き起こすかを示している。 T迷路シミュレーションの 1)目標統計の学習、及び 2)マルチエージェント交渉設定への拡張は、このアプローチが代替設定におけるノードと更新の再利用をいかに促すかを示している。合成AIFエージェントの完全なメッセージパッシングによる記述により、モデル間でのメッセージ更新を導出し再利用し、合成AIFの産業的応用に近づくことができる。 The Free Energy Principle (FEP) describes (biological) agents as minimising a variational Free Energy (FE) with respect to a generative model of their environment. Active Inference (AIF) is a corollary of the FEP that describes how agents explore and exploit their environment by minimising an expected FE objective. In two related papers, we describe a scalable, epistemic approach to synthetic AIF, by message passing on free-form Forney-style Factor Graphs (FFGs). A companion paper (part I) introduces a Constrained FFG (CFFG) notation that visually represents (generalised) FE objectives for AIF. The current paper (part II) derives message passing algorithms that minimise (generalised) FE objectives on a CFFG by variational calculus. A comparison between simulated Bethe and generalised FE agents illustrates how the message passing approach to synthetic AIF induces epistemic behaviour on a T-maze navigation task. Extension of the T-maze simulation to 1) learning goal statistics, and 2) a multi-agent bargaining setting, illustrates how this approach encourages reuse of nodes and updates in alternative settings. 
With a full message passing account of synthetic AIF agents, it becomes possible to derive and reuse message updates across models and move closer to industrial applications of synthetic AIF.	公開日:2024-09-26 翻訳日:2024-11-09 15:02:22
# 最適木アンサンブルの計算について On Computing Optimal Tree Ensembles ( http://arxiv.org/abs/2306.04423v2 ) ライセンス: Link先を確認	Christian Komusiewicz, Pascal Kunz, Frank Sommer, Manuel Sorge,	(参考訳) ランダムフォレストや、より一般的には(決定)木アンサンブルは、分類と回帰の方法として広く使われている。最近のアルゴリズムの進歩により、サイズや深さなどの様々な尺度に関して最適な決定木を計算できるようになった。木アンサンブルに対するそのような研究は我々の知る限り存在せず、この領域に貢献することを目指している。主に、2つの新しいアルゴリズムと対応する下界を提供する。まず、決定木に対するトラクタビリティーの結果を引き継ぎ、大幅に改善することができる: トレーニングデータセットとサイズ上限$S \in \mathbb{R}$が与えられたとき、サイズが高々$S$でデータを正しく分類する木アンサンブルを計算するアルゴリズムを得る。このアルゴリズムは$(4\delta D S)^S \cdot poly$-timeで実行され、$D$は最大のドメインサイズ、$\delta$は2つの例の間で異なる特徴数の最大値、$n$は入力例の数、$poly$は入力サイズの多項式である。決定木、すなわち、サイズ1のアンサンブルに対して、$(\delta D s)^s \cdot poly$ のランニング時間を得る。ここで$s$は木のサイズである。これらのアルゴリズムを実現するために,実践的な実装に期待できるwitness-tree技法を導入する。第2に、決定木の計算にうまく適用されてきた動的プログラミングが木アンサンブルにも有効である可能性を示し、$\ell^n \cdot poly$-timeアルゴリズムを提供する。ここで$\ell$は木の数である。最後に、決定木と木アンサンブルのトレーニングデータセットの分類に必要なカット数を比較し、木の数が増えるにつれてアンサンブルが必要とするカット数が指数関数的に少なくなりうることを示す。 Random forests and, more generally, (decision-)tree ensembles are widely used methods for classification and regression. Recent algorithmic advances allow computing decision trees that are optimal for various measures such as their size or depth. We are not aware of such research for tree ensembles and aim to contribute to this area. Mainly, we provide two novel algorithms and corresponding lower bounds. First, we are able to carry over and substantially improve on tractability results for decision trees: We obtain an algorithm that, given a training-data set and a size bound $S \in \mathbb{R}$, computes a tree ensemble of size at most $S$ that classifies the data correctly. The algorithm runs in $(4\delta D S)^S \cdot poly$-time, where $D$ is the largest domain size, $\delta$ is the largest number of features in which two examples differ, $n$ is the number of input examples, and $poly$ is a polynomial of the input size. For decision trees, that is, ensembles of size 1, we obtain a running time of $(\delta D s)^s \cdot poly$, where $s$ is the size of the tree. 
To obtain these algorithms, we introduce the witness-tree technique, which seems promising for practical implementations. Secondly, we show that dynamic programming, which has been applied successfully to computing decision trees, may also be viable for tree ensembles, providing an $\ell^n \cdot poly$-time algorithm, where $\ell$ is the number of trees. Finally, we compare the number of cuts necessary to classify training data sets for decision trees and tree ensembles, showing that ensembles may need exponentially fewer cuts for increasing number of trees.	公開日:2024-09-24 翻訳日:2024-11-09 15:02:22
# Etsy Searchにおける統一埋め込みに基づくパーソナライズされた検索 Unified Embedding Based Personalized Retrieval in Etsy Search ( http://arxiv.org/abs/2306.04833v2 ) ライセンス: Link先を確認	Rishikesh Jha, Siddharth Subramaniyam, Ethan Benjamin, Thrivikrama Taula,	(参考訳) 埋め込みベースのニューラル検索は、末尾クエリの製品検索でしばしば発生するセマンティックギャップ問題に対処するための一般的なアプローチである。対照的に、一般的なクエリにはコンテキストが欠如しており、ユーザの過去のインタラクションから追加のコンテキストが役に立つような、幅広い意図がある。本稿では、セマンティックギャップ問題と、パーソナライズされたセマンティック検索のためのエンド・ツー・エンド・トレーニングモデルの両方に対処する新しいアプローチを共有する。グラフ, トランスフォーマー, 項ベースの埋め込みを終端から終端まで組み込んだ統合埋め込みモデルを学習し, 性能と効率の最適なトレードオフのための設計選択を共有することを提案する。我々は、機能工学、ハードネガティブサンプリング戦略、トランスフォーマーモデルの適用に関する知見を共有し、新しい事前学習戦略や、検索関連性を改善し、そのようなモデルを産業規模で展開するための他の手法を含む。我々のパーソナライズされた検索モデルは、検索購入率の5.58%、サイト全体のコンバージョン率の2.63%、複数のA/Bテストにまたがるライブトラフィックにおいて、検索体験を著しく改善する。 Embedding-based neural retrieval is a prevalent approach to address the semantic gap problem which often arises in product search on tail queries. In contrast, popular queries typically lack context and have a broad intent where additional context from users historical interaction can be helpful. In this paper, we share our novel approach to address both: the semantic gap problem followed by an end to end trained model for personalized semantic retrieval. We propose learning a unified embedding model incorporating graph, transformer and term-based embeddings end to end and share our design choices for optimal tradeoff between performance and efficiency. We share our learnings in feature engineering, hard negative sampling strategy, and application of transformer model, including a novel pre-training strategy and other tricks for improving search relevance and deploying such a model at industry scale. Our personalized retrieval model significantly improves the overall search experience, as measured by a 5.58% increase in search purchase rate and a 2.63% increase in site-wide conversion rate, aggregated across multiple A/B tests - on live traffic.	公開日:2024-09-25 翻訳日:2024-11-09 15:02:22
# Mnemonic Codeによるワンショット機械アンラーニング One-Shot Machine Unlearning with Mnemonic Code ( http://arxiv.org/abs/2306.05670v2 ) ライセンス: Link先を確認	Tomoya Yamashita, Masanori Yamada, Takashi Shibata,	(参考訳) 人工知能(AI)アプリケーションに固有の倫理的およびプライバシー上の問題は、ディープラーニングの急速な普及に伴い懸念が高まっている。機械アンラーニング(MU)は、訓練済みのAIモデルに望ましくないトレーニングデータを忘れさせることによって、これらの問題に対処する研究領域である。残念なことに、既存のMUメソッドの多くは、忘却にかなりの時間と計算コストを必要とする。したがって、これらの手法を実用的なデータセットや高度なアーキテクチャ、例えば ImageNet や Transformer に適用することは、しばしば困難である。この問題に対処するために,軽量かつ効果的なMU法を提案する。本手法は, 忘却対象に敏感なモデルパラメータを同定し, そのモデルパラメータに摂動を加える。敏感なパラメータは,FIM(Fisher Information Matrix)を計算することで同定する。このアプローチでは、忘却のための時間のかかる追加トレーニングは必要ない。さらに,mnemonic codeと呼ばれるクラス固有のランダム信号を導入し,一般にトレーニングデータ全体を必要とし大きな計算コストを伴うFIM計算のコストを削減する。本手法では, mnemonic codeを用いてモデルを訓練し, 忘却時には少数のmnemonic codeを使用してFIMを計算し, 忘却に有効な摂動を得る。包括的実験により,本手法は既存のMU法よりも高速で,忘却性能も高いことが示された。さらに,本手法は,より実用的なデータセットや高度なアーキテクチャに拡張可能であることを示す。 Ethical and privacy issues inherent in artificial intelligence (AI) applications have been a growing concern with the rapid spread of deep learning. Machine unlearning (MU) is the research area that addresses these issues by making a trained AI model forget about undesirable training data. Unfortunately, most existing MU methods incur significant time and computational costs for forgetting. Therefore, it is often difficult to apply these methods to practical datasets and sophisticated architectures, e.g., ImageNet and Transformer. To tackle this problem, we propose a lightweight and effective MU method. Our method identifies the model parameters sensitive to the forgetting targets and adds perturbation to such model parameters. We identify the sensitive parameters by calculating the Fisher Information Matrix (FIM). This approach does not require time-consuming additional training for forgetting. In addition, we introduce class-specific random signals called mnemonic code to reduce the cost of FIM calculation, which generally requires the entire training data and incurs significant computational costs. 
In our method, we train the model with mnemonic code; when forgetting, we use a small number of mnemonic codes to calculate the FIM and get the effective perturbation for forgetting. Comprehensive experiments demonstrate that our method is faster and better at forgetting than existing MU methods. Furthermore, we show that our method can scale to more practical datasets and sophisticated architectures.	公開日:2024-09-25 翻訳日:2024-11-09 15:02:22
# CCE:信頼度制御によるロボットナビゲーションのための効率的なスパースリワード政策学習 CCE: Sample Efficient Sparse Reward Policy Learning for Robotic Navigation via Confidence-Controlled Exploration ( http://arxiv.org/abs/2306.06192v8 ) ライセンス: Link先を確認	Bhrij Patel, Kasun Weerakoon, Wesley A. Suttle, Alec Koppel, Brian M. Sadler, Tianyi Zhou, Amrit Singh Bedi, Dinesh Manocha,	(参考訳) 本稿では,ロボットナビゲーションなどのスパース報酬設定のための強化学習(RL)アルゴリズムのトレーニングサンプル効率を高めるための新しい探索手法である信頼性制御探索(CCE)を紹介する。スパース報酬はRLで一般的であり、設計と実装に便利であるが、探索の課題のために対処するのが通常困難である。既存の手法では、探索課題に対処するための正規化ベースの手法が展開されている。しかし、正規化は報酬関数自体を変更するため、探索と搾取のバランスを特徴付けることは困難である。既存の文献における正規化に基づくアプローチとは対照的に、我々のアプローチであるCCEは、勾配推定と政策エントロピーの間の新しい関係に基づいている。 CCEは、探索を制御するために訓練中に使用される勾配更新のサンプル数を動的に調整する。興味深いことに、CCEは既存のオン・ポリティクスとオフ・ポリティクスのRL手法の両方に適用でき、この手法を3つの一般的なRL手法(REINFORCE, Proximal Policy Optimization (PPO),Soft Actor-Critic (SAC))に対して実証的に有効性を示す。我々は,サンプル予算を制約する場合に,一定の軌道長とエントロピー正規化を用いる従来の手法よりもCCEの方が優れる実世界のシミュレーション実験を通して実証する。固定されたサンプル予算では、CCEは航法成功率18\%、航法パス長20-38\%、高架コスト9.32\%を達成している。さらに,CCEをClearpath Huskyロボットに統合し,複雑な屋外環境に適用可能であることを示す。 We introduce Confidence-Controlled Exploration (CCE), a novel exploration scheme designed to enhance the training sample efficiency of reinforcement learning (RL) algorithms for sparse reward settings such as robot navigation. Sparse rewards are common in RL and convenient to design and implement, but typically hard to deal with due to the challenges of exploration. Existing methods deploy regularization-based methods to deal with the exploration challenges. However, it is hard to characterize the balance between exploration and exploitation because regularization modifies the reward function itself, hence changing the objective we are optimizing for. In contrast to regularization-based approaches in the existing literature, our approach, CCE, is based on a novel relationship we provide between gradient estimation and policy entropy. CCE dynamically adjusts the number of samples of the gradient update used during training to control exploration. 
Interestingly, CCE can be applied to both existing on-policy and off-policy RL methods, which we demonstrate by empirically validating its efficacy on three popular RL methods: REINFORCE, Proximal Policy Optimization (PPO), and Soft Actor-Critic (SAC) for goal-reaching robotic navigation tasks. We demonstrate through simulated and real-world experiments that CCE outperforms conventional methods that employ constant trajectory lengths and entropy regularization when constraining the sample budget. For a fixed sample budget, CCE achieves an 18% increase in navigation success rate, a 20-38% reduction in navigation path length, and a 9.32% decrease in elevation costs. Furthermore, we showcase the versatility of CCE by integrating it with the Clearpath Husky robot, illustrating its applicability in complex outdoor environments.	公開日:2024-09-24 翻訳日:2024-11-09 15:02:22
# 雑音を考慮した自己教師付き学習と効率的なエンコーダによる時系列符号化の改善 Improving Time Series Encoding with Noise-Aware Self-Supervised Learning and an Efficient Encoder ( http://arxiv.org/abs/2306.06579v2 ) ライセンス: Link先を確認	Duy A. Nguyen, Trang H. Tran, Huy Hieu Pham, Phi Le Nguyen, Lam M. Nguyen,	(参考訳) 本研究では,自己教師付き手法を用いた時系列表現学習問題について検討する。コントラスト学習はこの分野でよく知られており、シリーズから情報を抽出し、タスクに適した表現を生成するための強力な方法である。時系列の特徴を捉える能力にもかかわらず、これらの手法は、しばしば重要な要因である、この種のデータに固有のノイズを見落としている。さらに、効率的な軽量エンコーダアーキテクチャの開発には注目すべき注意が払われていない。本研究は,自然時系列における雑音波信号の存在を考慮し,一貫した表現学習を促進する革新的な学習戦略を提案することによって,これらのギャップに対処する。さらに,インセプションブロック内に拡張畳み込みを組み込んだエンコーダアーキテクチャを提案する。実験結果から, 予測, 分類, 異常検出など, 様々なタスクにおいて, 最先端のアプローチを一貫して上回る結果が得られた。特に,本手法はUCRデータセットの分類の3分の2以上で上位にランクされ,第2のアプローチと比較してパラメータの40%しか利用されていない。 CoInceptionフレームワークのソースコードは https://github.com/anhduy0911/CoInception からアクセスできます。 In this work, we investigate the time series representation learning problem using self-supervised techniques. Contrastive learning is well-known in this area as it is a powerful method for extracting information from the series and generating task-appropriate representations. Despite its proficiency in capturing time series characteristics, these techniques often overlook a critical factor - the inherent noise in this type of data, a consideration usually emphasized in general time series analysis. Moreover, there is a notable absence of attention to developing efficient yet lightweight encoder architectures, with an undue focus on delivering contrastive losses. Our work addresses these gaps by proposing an innovative training strategy that promotes consistent representation learning, accounting for the presence of noise-prone signals in natural time series. Furthermore, we propose an encoder architecture that incorporates dilated convolution within the Inception block, resulting in a scalable and robust network with a wide receptive field. 
Experimental findings underscore the effectiveness of our method, consistently outperforming state-of-the-art approaches across various tasks, including forecasting, classification, and abnormality detection. Notably, our method attains the top rank in over two-thirds of the classification UCR datasets, utilizing only 40% of the parameters compared to the second-best approach. Our source code for CoInception framework is accessible at https://github.com/anhduy0911/CoInception.	公開日:2024-10-05 翻訳日:2024-11-09 15:02:22
# 近似制約最適化のための自己教師付きEquality Embedded Deep Lagrange Dual Self-supervised Equality Embedded Deep Lagrange Dual for Approximate Constrained Optimization ( http://arxiv.org/abs/2306.06674v5 ) ライセンス: Link先を確認	Minsoo Kim, Hongseok Kim,	(参考訳) 従来の解法はしばしば、特に大規模かつ時間クリティカルな問題において、制約付き最適化のために計算コストがかかる。これにより、ニューラルネットワーク(NN)を高速な最適解近似器として使用することへの関心が高まっているが、NNに制約を組み込むことは難しい。本稿では,ラベルを使わずに最適な解を求めるフレームワークであるDeepLDE(DeepLDE)を提案する。実現可能なソリューションを確保するため、NNに等価性制約を組み込み、未等式制約を課すために原始双対法を用いてNNを訓練する。さらに,DeepLDEの収束性を証明し,本手法だけでは等式埋め込みの助けなしには等式制約を保証できないことを示す。コンベックス,非凸,AC最適電力流(AC-OPF)問題に関するシミュレーション結果から,提案したDeepLDEはNNベースの全アプローチの中で最小の最適性ギャップを達成でき,かつ常に実現可能な解を確保できることを示す。さらに,提案手法の計算時間はDC3の約5～250倍であり,制約付き凸の解法,非凸最適化,AC-OPFの解法が提案されている。 Conventional solvers are often computationally expensive for constrained optimization, particularly in large-scale and time-critical problems. While this leads to a growing interest in using neural networks (NNs) as fast optimal solution approximators, incorporating the constraints with NNs is challenging. In this regard, we propose deep Lagrange dual with equality embedding (DeepLDE), a framework that learns to find an optimal solution without using labels. To ensure feasible solutions, we embed equality constraints into the NNs and train the NNs using the primal-dual method to impose inequality constraints. Furthermore, we prove the convergence of DeepLDE and show that the primal-dual learning method alone cannot ensure equality constraints without the help of equality embedding. Simulation results on convex, non-convex, and AC optimal power flow (AC-OPF) problems show that the proposed DeepLDE achieves the smallest optimality gap among all the NN-based approaches while always ensuring feasible solutions. Furthermore, the computation time of the proposed method is about 5 to 250 times faster than DC3 and the conventional solvers in solving constrained convex, non-convex optimization, and/or AC-OPF.	公開日:2024-09-23 翻訳日:2024-11-09 15:02:22
# 高次元過度線形回帰における最小ノルムリスクのバッチ安定化 Batches Stabilize the Minimum Norm Risk in High Dimensional Overparameterized Linear Regression ( http://arxiv.org/abs/2306.08432v3 ) ライセンス: Link先を確認	Shahar Stein Ioushua, Inbar Hasidim, Ofer Shayevitz, Meir Feder,	(参考訳) データをバッチに分割する学習アルゴリズムは、多くの機械学習アプリケーションで一般的であり、典型的には計算効率と性能のトレードオフを提供する。本稿では,等方的ガウス特徴を持つ最小ノルム過パラメータ線形回帰モデルのレンズによるバッチ分割の利点について検討する。最小ノルム推定器の自然な小バッチ版を提案し、その二次リスクを導出する。次に、最適なバッチサイズを特徴付け、ノイズレベルと過度パラメータ比に逆比例することを示す。最小ノルムとは対照的に,我々の推定器は過パラメトリゼーション比で単調に増加する安定なリスク挙動を認め,補間点での爆発と二重発振現象の両方を除去する。さらに、Wiener係数に等しい係数によるバッチ最小ノルム推定器の縮小がさらに安定化し、全ての設定において2次リスクを低くすることを示した。興味深いことに、バッチパーティションによって提供される暗黙の正規化は、バッチ間の機能の重複によって部分的に説明される。我々の境界は、新しい手法の組み合わせ、特にランダム部分空間上の雑音射影のワッサーシュタイン計量の正規近似によって導かれる。 Learning algorithms that divide the data into batches are prevalent in many machine-learning applications, typically offering useful trade-offs between computational efficiency and performance. In this paper, we examine the benefits of batch-partitioning through the lens of a minimum-norm overparametrized linear regression model with isotropic Gaussian features. We suggest a natural small-batch version of the minimum-norm estimator and derive bounds on its quadratic risk. We then characterize the optimal batch size and show it is inversely proportional to the noise level, as well as to the overparametrization ratio. In contrast to minimum-norm, our estimator admits a stable risk behavior that is monotonically increasing in the overparametrization ratio, eliminating both the blowup at the interpolation point and the double-descent phenomenon. We further show that shrinking the batch minimum-norm estimator by a factor equal to the Wiener coefficient further stabilizes it and results in lower quadratic risk in all settings. Interestingly, we observe that the implicit regularization offered by the batch partition is partially explained by feature overlap between the batches. 
Our bound is derived via a novel combination of techniques, in particular normal approximation in the Wasserstein metric of noisy projections over random subspaces.	公開日:2024-09-21 翻訳日:2024-11-09 15:02:22
# 平板最小値探索のための雑音安定性最適化:ヘッセン系正規化手法 Noise Stability Optimization for Finding Flat Minima: A Hessian-based Regularization Approach ( http://arxiv.org/abs/2306.08553v4 ) ライセンス: Link先を確認	Hongyang R. Zhang, Dongyue Li, Haotian Ju,	(参考訳) 過度にパラメータ化されたニューラルネットワークのトレーニングは、最近の文献で多くの研究を受けている。重要な考慮事項は、その非凸性や非線形幾何学のため、過度にパラメータ化されたネットワークの正規化である。本稿では、損失のヘシアンを正規化できるノイズ注入アルゴリズムについて検討し、平面的な損失面を持つ領域を導出する。具体的には、ニューラルネットワークの重み行列に等方性ガウスノイズを注入することにより、ヘッセンの痕跡のほぼ偏りのない推定値を得ることができる。しかし、バックプロパゲーション前に重み行列にノイズを加えることでノイズ注入を鼻で行うと、経験的改善は限られる。この制限に対処するために、ランダムノイズの正方向と負方向の両方に沿って重み行列に雑音を注入するヘッセンペナルティの2点推定を設計する。特に、この2点推定は、ヘッセン上の一階テイラーの展開項の分散を排除している。我々は、データから測定できるヘッセン(および重み空間の半径)のトレースに依存するPAC-ベイズ一般化の有界性を示す。我々は,我々のアプローチを検証するための詳細な実験を行い,ヘッセン語を効果的に正則化し,一般化を向上させることができることを示す。まず,6つの画像分類データセット上での微調整ResNetの精度を最大2.4%向上させることができる。さらに、ヘッセンの痕跡は15.8%減少し、最大の固有値は我々のアプローチにより9.7%減少する。また、ヘッセンの正則化と重みの減衰とデータ増大が組み合わされ、より強い正則化がもたらされる。第2に,本手法はマルチモーダルCLIPモデルとチェーン・オブ・ファインタニングの事前学習における一般化の改善に有効である。 The training of over-parameterized neural networks has received much study in recent literature. An important consideration is the regularization of over-parameterized networks due to their highly nonconvex and nonlinear geometry. In this paper, we study noise injection algorithms, which can regularize the Hessian of the loss, leading to regions with flat loss surfaces. Specifically, by injecting isotropic Gaussian noise into the weight matrices of a neural network, we can obtain an approximately unbiased estimate of the trace of the Hessian. However, naively implementing the noise injection via adding noise to the weight matrices before backpropagation presents limited empirical improvements. To address this limitation, we design a two-point estimate of the Hessian penalty, which injects noise into the weight matrices along both positive and negative directions of the random noise. In particular, this two-point estimate eliminates the variance of the first-order Taylor's expansion term on the Hessian. 
We show a PAC-Bayes generalization bound that depends on the trace of the Hessian (and the radius of the weight space), which can be measured from data. We conduct a detailed experimental study to validate our approach and show that it can effectively regularize the Hessian and improve generalization. First, our algorithm can outperform prior approaches on sharpness-reduced training, delivering up to a 2.4% test accuracy increase for fine-tuning ResNets on six image classification datasets. Moreover, the trace of the Hessian reduces by 15.8%, and the largest eigenvalue is reduced by 9.7% with our approach. We also find that the regularization of the Hessian can be combined with weight decay and data augmentation, leading to stronger regularization. Second, our approach remains effective for improving generalization in pretraining multimodal CLIP models and chain-of-thought fine-tuning.	公開日:2024-09-23 翻訳日:2024-11-09 15:02:22
# OpenOOD v1.5: アウト・オブ・ディストリビューション検出のためのベンチマーク強化 OpenOOD v1.5: Enhanced Benchmark for Out-of-Distribution Detection ( http://arxiv.org/abs/2306.09301v4 ) ライセンス: Link先を確認	Jingyang Zhang, Jingkang Yang, Pengyun Wang, Haoqi Wang, Yueqian Lin, Haoran Zhang, Yiyou Sun, Xuefeng Du, Yixuan Li, Ziwei Liu, Yiran Chen, Hai Li,	(参考訳) アウト・オブ・ディストリビューション(OOD)検出は、オープンワールド・インテリジェントシステムの信頼性の高い運用に不可欠である。 OOD検出手法の出現にもかかわらず、評価の不整合は、この分野の進歩を追跡する上での課題である。 OpenOOD v1はOOD検出評価の統合を開始したが、スケーラビリティとユーザビリティの制限に直面した。本報告では,OOD検出手法の精度,標準化,ユーザフレンドリな評価を保証したOpenOOD v1.5を提案する。特に、OpenOOD v1.5は、評価機能をImageNetなどの大規模データセットに拡張し、未調査の重要でないフルスペクトルOOD検出を調査し、オンラインリーダーボードや使いやすい評価器などの新機能を導入している。この研究は、総合的な実験結果から得られた深い分析や洞察にも貢献し、OOD検出手法の知識プールを強化している。これらの拡張により、OpenOOD v1.5は進歩を加速し、OOD検出研究のためのより堅牢で包括的な評価ベンチマークを提供することを目的としている。 Out-of-Distribution (OOD) detection is critical for the reliable operation of open-world intelligent systems. Despite the emergence of an increasing number of OOD detection methods, the evaluation inconsistencies present challenges for tracking the progress in this field. OpenOOD v1 initiated the unification of the OOD detection evaluation but faced limitations in scalability and usability. In response, this paper presents OpenOOD v1.5, a significant improvement from its predecessor that ensures accurate, standardized, and user-friendly evaluation of OOD detection methodologies. Notably, OpenOOD v1.5 extends its evaluation capabilities to large-scale datasets such as ImageNet, investigates full-spectrum OOD detection which is important yet underexplored, and introduces new features including an online leaderboard and an easy-to-use evaluator. This work also contributes in-depth analysis and insights derived from comprehensive experimental results, thereby enriching the knowledge pool of OOD detection methodologies. With these enhancements, OpenOOD v1.5 aims to drive advancements and offer a more robust and comprehensive evaluation benchmark for OOD detection research.	
公開日:2024-09-24 翻訳日:2024-11-09 15:02:22
# 時空間量子相関の因果分類 Causal classification of spatiotemporal quantum correlations ( http://arxiv.org/abs/2306.09336v2 ) ライセンス: Link先を確認	Minjeong Song, Varun Narasimhachar, Bartosz Regula, Thomas J. Elliott, Mile Gu,	(参考訳) 測定結果のみの相関から、そのような相関が一時的なものであるかどうかを2つの孤立した当事者が決定できるだろうか? つまり、2つの異なるタイミングで同じシステムを与えられたと判断できるのだろうか? 古典的な統計によると、量子論は一致しない。ここでは、そのような量子相関を時間的に特定できる必要十分条件を紹介する。時間反転下での時間的非対称性を実証し,空間的量子相関の尺度であることを明らかにした。以上の結果から,特定の量子相関は時間的固有矢印を持ち,様々な因果構造との整合性に基づいて,時空間における一般量子相関の分類が可能であることが示唆された。 From correlations in measurement outcomes alone, can two otherwise isolated parties establish whether such correlations are atemporal? That is, can they rule out that they have been given the same system at two different times? Classical statistics says no, yet quantum theory disagrees. Here, we introduce the necessary and sufficient conditions by which such quantum correlations can be identified as atemporal. We demonstrate the asymmetry of atemporality under time reversal, and reveal it to be a measure of spatial quantum correlation distinct from entanglement. Our results indicate that certain quantum correlations possess an intrinsic arrow of time, and enable classification of general quantum correlations across space-time based on their (in)compatibility with various underlying causal structures.	公開日:2024-09-23 翻訳日:2024-11-09 15:02:22
# 直接検出のための光学式暗黒物質計 Optomechanical dark matter instrument for direct detection ( http://arxiv.org/abs/2306.09726v2 ) ライセンス: Link先を確認	Christopher G. Baker, Warwick P. Bowen, Peter Cox, Matthew J. Dolan, Maxim Goryachev, Glen Harris,	(参考訳) 低質量暗黒物質を直接検出するための新しい手法を応用したオプトメカニカルダークマターインストゥルメント(ODIN)を提案する。我々は,超流動ヘリウムと相互作用する暗黒物質を光学的空洞で考える。有効場理論を用いて,暗黒物質がフォノンから散乱する速度を,高密度で駆動されるキャビティの音響モードで計算する。この散乱過程は、フォノンを基底状態の第2音響モードに堆積させる。堆積されたフォノン (\mu$eV range) は、ポンプレーザーとの光学的相互作用により光子(eV range)に変換される。この光子を効率よく検出することができ、keVスケールの暗黒物質を感度よくプローブする手段を提供する。我々は,背景の現実的な推定を行い,そのような実験に関連する技術的課題について議論する。我々は、0.5から300keVまでの暗黒物質質量に対する暗黒物質-核子相互作用の予測限界を計算し、将来の装置が$\mathcal{O}(10^{-32})$ cm$^2$の低い断面を探査できると推定した。 We propose the Optomechanical Dark-matter INstrument (ODIN), based on a new method for the direct detection of low-mass dark matter. We consider dark matter interacting with superfluid helium in an optomechanical cavity. Using an effective field theory, we calculate the rate at which dark matter scatters off phonons in a highly populated, driven acoustic mode of the cavity. This scattering process deposits a phonon into a second acoustic mode in its ground state. The deposited phonon ($\mu$eV range) is then converted to a photon (eV range) via an optomechanical interaction with a pump laser. This photon can be efficiently detected, providing a means to sensitively probe keV scale dark matter. We provide realistic estimates of the backgrounds and discuss the technical challenges associated with such an experiment. We calculate projected limits on dark matter-nucleon interactions for dark matter masses ranging from 0.5 to 300 keV and estimate that a future device could probe cross-sections as low as $\mathcal{O}(10^{-32})$ cm$^2$.	公開日:2024-09-24 翻訳日:2024-11-09 14:51:04
# フェデレーション学習のための視覚変換器の連続的適応 Continual Adaptation of Vision Transformers for Federated Learning ( http://arxiv.org/abs/2306.09970v2 ) ライセンス: Link先を確認	Shaunak Halbe, James Seale Smith, Junjiao Tian, Zsolt Kira,	(参考訳) 本稿では,サーバがクライアントの集合と通信し,データを共有したり保存したりすることなく,新たな概念を段階的に学習する,CFL(Continuousal Federated Learning)の重要な課題に焦点を当てる。この問題の複雑さは、継続学習とフェデレート学習の両方の観点からの課題によって複雑化されます。具体的には、CFLセットアップでトレーニングされたモデルは、クライアント間のデータの異質性によって悪化する破滅的な忘れ込みに悩まされる。この問題に対する既存の試みは、クライアントや通信チャネルに大きなオーバーヘッドを課す傾向にあり、あるいは保存されたデータにアクセスする必要があるため、プライバシによる実際の使用には適さない。本稿では,記憶データへのアクセスを必要とせず,オーバーヘッドコストを最小限に抑えながら,忘れと不均一性に取り組む。本研究では,視覚変換器の文脈でこの問題を考察し,動的分布に適応するパラメータ効率のアプローチを,最小限に抑えながら検討する。我々は、プロンプトベースのアプローチ(プロンプトとクラシファイアヘッドのみを通信しなければならない)を活用し、サーバにおけるクライアントモデルを統合するための、新しくて軽量な生成と蒸留方式を提案する。我々は、画像分類の問題を定式化し、比較のための強力なベースラインを確立し、CIFAR-100上で実験を行い、ImageNet-RやDomainNetのような大規模データセットに挑戦する。提案手法は,通信コストとクライアントレベルの計算コストを大幅に削減しつつ,既存手法と独自のベースラインを最大7%向上させる。コードは https://github.com/shaunak27/hepco-fed で公開されている。 In this paper, we focus on the important yet understudied problem of Continual Federated Learning (CFL), where a server communicates with a set of clients to incrementally learn new concepts over time without sharing or storing any data. The complexity of this problem is compounded by challenges from both the Continual and Federated Learning perspectives. Specifically, models trained in a CFL setup suffer from catastrophic forgetting which is exacerbated by data heterogeneity across clients. Existing attempts at this problem tend to impose large overheads on clients and communication channels or require access to stored data which renders them unsuitable for real-world use due to privacy. In this paper, we attempt to tackle forgetting and heterogeneity while minimizing overhead costs and without requiring access to any stored data. We study this problem in the context of Vision Transformers and explore parameter-efficient approaches to adapt to dynamic distributions while minimizing forgetting. 
We achieve this by leveraging a prompting-based approach (such that only prompts and classifier heads have to be communicated) and proposing a novel and lightweight generation and distillation scheme to consolidate client models at the server. We formulate this problem for image classification, establish strong baselines for comparison, and conduct experiments on CIFAR-100 as well as challenging, large-scale datasets like ImageNet-R and DomainNet. Our approach outperforms both existing methods and our own baselines by as much as 7% while significantly reducing communication and client-level computation costs. Code available at https://github.com/shaunak27/hepco-fed.	公開日:2024-09-22 翻訳日:2024-11-09 14:51:04
# 準周期モザイク格子における多動性エッジの探索 Probing multi-mobility edges in quasiperiodic mosaic lattices ( http://arxiv.org/abs/2306.10829v2 ) ライセンス: Link先を確認	Jun Gao, Ivan M. Khaymovich, Xiao-Wei Wang, Ze-Sheng Xu, Adrian Iovan, Govind Krishna, Jiayidaer Jieensi, Andrea Cataldo, Alexander V. Balatsky, Val Zwiller, Ali W. Elshaari,	(参考訳) モビリティエッジ(ME)は、エネルギースペクトルにおける局所化状態と局所化状態の間の重要な遷移を示す、局在化物理学を理解するための重要な概念である。アンダーソン局在化スケーリング理論は、低次元系におけるMEの欠如を予測する。そのため、特に低次元の単一粒子に対する正確なMEの探索は、最近理論と実験的研究の両方に大きな関心を集め、顕著な進歩をもたらした。しかし、複数のMEを示す単一のシステムや、強い障害領域内であっても、拡張状態の持続的な存在の可能性など、いくつかのオープンな疑問が残っている。ここでは、準周期モザイク格子と精密に設計されたナノフォトニック回路を用いて、これらの問題に対処する実験的な証拠を提供する。本研究は, 2次対称性の破れと変調周期の異なる格子における拡張状態と局所状態の共存を実証するものである。単一サイトインジェクションと障害レベルの走査により,変調格子のMEを概ね調査することができた。これらの結果は、最近の理論予測を裏付け、ME物理を研究するための新しい道を導入し、ハイブリッド集積フォトニックデバイスを用いた量子状態におけるME物理のさらなる探索にインスピレーションを与える。 The mobility edge (ME) is a crucial concept in understanding localization physics, marking the critical transition between extended and localized states in the energy spectrum. Anderson localization scaling theory predicts the absence of ME in lower dimensional systems. Hence, the search for exact MEs, particularly for single particles in lower dimensions, has recently garnered significant interest in both theoretical and experimental studies, resulting in notable progress. However, several open questions remain, including the possibility of a single system exhibiting multiple MEs and the continual existence of extended states, even within the strong disorder domain. Here, we provide experimental evidence to address these questions by utilizing a quasiperiodic mosaic lattice with meticulously designed nanophotonic circuits. Our observations demonstrate the coexistence of both extended and localized states in lattices with broken duality symmetry and varying modulation periods. By single site injection and scanning the disorder level, we could approximately probe the ME of the modulated lattice. 
These results corroborate recent theoretical predictions, introduce a new avenue for investigating ME physics, and offer inspiration for further exploration of ME physics in the quantum regime using hybrid integrated photonic devices.	公開日:2024-09-23 翻訳日:2024-11-09 14:51:04
# 条件付きデュアルオートエンコーダによる暗黒ショータのトリガ Triggering Dark Showers with Conditional Dual Auto-Encoders ( http://arxiv.org/abs/2306.12955v2 ) ライセンス: Link先を確認	Luca Anzalone, Simranjit Singh Chhibra, Benedikt Maier, Nadezda Chernyavskaya, Maurizio Pierini,	(参考訳) 本稿では,コライダにおける一般およびモデルに依存しない新しい物理探索のための条件付きデュアルオートエンコーダ(CoDAE)のファミリーを提案する。新たな種類の粒子や相互作用から生じる新しい物理信号は、予測される背景事象に対するデータの偏差を引き起こす異常であると考えられる。本研究では,背景サンプルのみを用いた正常な異常検出を行い,物理ベースの前処理や信号に対する強い仮定を使わずに,大規模かつ疎度な生検出器画像に(変分的)オートエンコーダを適用した強力のダークバージョンを探索する。提案したCoDAEは双対エンコーダ設計であり、空間条件付けにより補助的かつコンパクトなラテント空間を学習できる。 ATLASやCMSのような大型ハドロン衝突型加速器実験のリアルタイムイベントトリガシステムにおいて、この手法が正確で高速でモデルに依存しないアルゴリズムとして適用可能であることを示すため、教師なしモデルが複数のダークシャワーモデルに対して優れた差別を示すことは初めてである。 We present a family of conditional dual auto-encoders (CoDAEs) for generic and model-independent new physics searches at colliders. New physics signals, which arise from new types of particles and interactions, are considered in our study as anomalies causing deviations in data with respect to expected background events. In this work, we perform a normal-only anomaly detection, which employs only background samples, to search for manifestations of a dark version of strong force applying (variational) auto-encoders on raw detector images, which are large and highly sparse, without leveraging any physics-based pre-processing or strong assumption on the signals. The proposed CoDAE has a dual-encoder design, which is general and can learn an auxiliary yet compact latent space through spatial conditioning, showing a neat improvement over competitive physics-based baselines and related approaches, therefore also reducing the gap with fully supervised models. It is the first time an unsupervised model is shown to exhibit excellent discrimination against multiple dark shower models, illustrating the suitability of this method as an accurate, fast, model-independent algorithm to deploy, e.g., in the real-time event triggering systems of Large Hadron Collider experiments such as ATLAS and CMS.	
公開日:2024-09-24 翻訳日:2024-11-09 14:51:04
# HamLib: 量子アルゴリズムとハードウェアのベンチマークのためのハミルトンのライブラリ HamLib: A library of Hamiltonians for benchmarking quantum algorithms and hardware ( http://arxiv.org/abs/2306.13126v4 ) ライセンス: Link先を確認	Nicolas PD Sawaya, Daniel Marti-Dafcik, Yang Ho, Daniel P Tabor, David E Bernal Neira, Alicia B Magann, Shavindra Premaratne, Pradeep Dubey, Anne Matsuura, Nathan Bishop, Wibe A de Jong, Simon Benjamin, Ojas Parekh, Norm Tubman, Katherine Klymko, Daan Camps,	(参考訳) 計算ハードウェア、ソフトウェア、アルゴリズムを特徴付け、ベンチマークするためには、多くの問題インスタンスを手元に持つことが不可欠である。これは量子計算に当てはまるものではなく、実世界の問題インスタンスの集合がベンチマーク研究を可能にし、アルゴリズムとハードウェアの設計の両方を改善するのに役立つ。この目的のために、量子ハミルトニアンの大規模なデータセットを提示する。 HamLib(ハミルトン図書館)と呼ばれるこのデータセットは、オンラインで無料で利用可能であり、2から1000キュービットまでの問題サイズを含んでいる。 HamLibには、Heisenbergモデル、Fermi-Hubbardモデル、Bose-Hubbardモデル、分子電子構造、分子振動構造、MaxCut、Max-$k$-SAT、Max-$k$-Cut、QMaxCut、旅行セールスパーソンの問題が含まれている。この努力の目標は (a)問題インスタンスを作成してキュービット表現にマッピングする必要をなくして研究者の時間を節約すること。 (b)新しいアルゴリズムやハードウェアをより徹底的にテストできるようにし、 (c) 研究における再現性と標準化を可能にすること。 In order to characterize and benchmark computational hardware, software, and algorithms, it is essential to have many problem instances on-hand. This is no less true for quantum computation, where a large collection of real-world problem instances would allow for benchmarking studies that in turn help to improve both algorithms and hardware designs. To this end, here we present a large dataset of qubit-based quantum Hamiltonians. The dataset, called HamLib (for Hamiltonian Library), is freely available online and contains problem sizes ranging from 2 to 1000 qubits. HamLib includes problem instances of the Heisenberg model, Fermi-Hubbard model, Bose-Hubbard model, molecular electronic structure, molecular vibrational structure, MaxCut, Max-$k$-SAT, Max-$k$-Cut, QMaxCut, and the traveling salesperson problem. 
The goals of this effort are (a) to save researchers time by eliminating the need to prepare problem instances and map them to qubit representations, (b) to allow for more thorough tests of new algorithms and hardware, and (c) to allow for reproducibility and standardization across research studies.	公開日:2024-09-24 翻訳日:2024-11-09 14:51:04
# Universal Session Protocol: リモートコードの実行に対する一般的な解決策 Universal Session Protocol: A General Solution to Remote Code Execution ( http://arxiv.org/abs/2306.14339v2 ) ライセンス: Link先を確認	Jonathon Anderson,	(参考訳) 現在、TCP/IPモデルは、アプリケーションへの接続に対するすべての要求を無条件で満たすことで、匿名で脆弱性を悪用することができる。私は、TCP/IPモデルのアーキテクチャの変更としてユニバーサルセッションプロトコルを提案しており、認証交渉と履行のための構造化された汎用プロセスを含むセッション層を含んでいます。ユニバーサルセッションプロトコルは、セキュリティクリティカルシステムにおける不正なデータ処理を排除する緊急かつ重要な必要性に対処する。 TCP/IPセキュリティに関するこれまでの研究は、アプリケーション設計と実装、および既存のプロトコル層に重点を置いていたが、緩和制御としてセッション層を追加することに失敗した。異なる認証レイヤを実装することに失敗すると、ライフとセキュリティクリティカルなインフラストラクチャを含む、グローバルインターネットに接続されたすべてのリソースが、匿名で追跡不能なソースからの攻撃に脆弱になる。 Universal Session ProtocolはTCP/IP Session Layerを確立することでソリューションを提供する。認証後、IDはデータストリームに関連付けられ、すべてのデータが法医学的な目的のためにそのIDに関連付けられている可能性がある。認証が失敗した場合、アプリケーションはユーザーデータを決して処理せず、サービスは匿名の悪いアクターから安全になる。 Currently, the TCP/IP model enables exploitation of vulnerabilities anonymously by unconditionally fulfilling every request for a connection into an application; the model only incorporates authentication within applications themselves, rather than as a precondition for access into applications. I am proposing the Universal Session Protocol as a change to the architecture of the TCP/IP model to include a session layer featuring a structured generalized process for authentication negotiation and fulfillment. The Universal Session Protocol addresses an urgent and vital need to eliminate unauthenticated data processing on security critical systems. Previous work regarding TCP/IP security has focused on the application design and implementation and existing protocol layers, but has failed to posit the addition of a session layer as a mitigating control. Failing to implement a distinct authentication layer leaves every resource connected to the global Internet, including life and security critical infrastructure, vulnerable to attacks from anonymous and untraceable sources. 
The Universal Session Protocol provides a solution by establishing a TCP/IP Session Layer that explicitly provides authentication before a data stream is accessible within an application. After authentication, an identity is associated with the data stream so that all data may be related back to that identity for forensic purposes. If authentication fails, the application will never process user data, rendering the service safe from anonymous bad actors.	公開日:2024-09-24 翻訳日:2024-11-09 14:51:04
# 時間と状態依存型ニューラル遅延微分方程式 Time and State Dependent Neural Delay Differential Equations ( http://arxiv.org/abs/2306.14545v2 ) ライセンス: Link先を確認	Thibault Monsel, Onofrio Semeraro, Lionel Mathelin, Guillaume Charpiat,	(参考訳) 物理学や工学から医学、経済学まで、幅広い種類の問題の統治方程式において、不連続性と遅延項が遭遇する。これらのシステムは、標準常微分方程式(ODE)やニューラル常微分方程式(NODE)のようなデータ駆動近似で適切にモデル化およびシミュレーションすることはできない。この問題を回避するために、潜伏変数は一般に高次元空間における系の力学を解き、元の空間への射影として解を得るために導入される。しかし、この解は物理的解釈可能性に欠ける。対照的に、DDE(Delay Differential Equations)とそのデータ駆動の近似方程式は、このようなシステムを特徴づける良い候補として自然に現れる。本稿では,複数および状態依存遅延をモデル化可能な汎用かつ柔軟なフレームワークであるNeural State-Dependent DDE(SDDDE)を導入することで,最近提案されたNeural DDEを再考する。提案手法は競争力があり,様々な遅延力学系における他の連続クラスモデルよりも優れていることを示す。コードはリポジトリ \href{https://github.com/thibmonsel/Time-and-State-Dependent-Neural-Delay-Differential-Equations}{here} で公開されている。 Discontinuities and delayed terms are encountered in the governing equations of a large class of problems ranging from physics and engineering to medicine and economics. These systems cannot be properly modelled and simulated with standard Ordinary Differential Equations (ODE), or data-driven approximations such as Neural Ordinary Differential Equations (NODE). To circumvent this issue, latent variables are typically introduced to solve the dynamics of the system in a higher dimensional space and obtain the solution as a projection to the original space. However, this solution lacks physical interpretability. In contrast, Delay Differential Equations (DDEs), and their data-driven approximated counterparts, naturally appear as good candidates to characterize such systems. In this work we revisit the recently proposed Neural DDE by introducing Neural State-Dependent DDE (SDDDE), a general and flexible framework that can model multiple and state- and time-dependent delays. We show that our method is competitive and outperforms other continuous-class models on a wide variety of delayed dynamical systems. 
Code is available at https://github.com/thibmonsel/Time-and-State-Dependent-Neural-Delay-Differential-Equations.	公開日:2024-09-26 翻訳日:2024-11-09 14:51:04
# 4重境界誤差再分別による高品質未知オブジェクトインスタンスセグメンテーション High-quality Unknown Object Instance Segmentation via Quadruple Boundary Error Refinement ( http://arxiv.org/abs/2306.16132v3 ) ライセンス: Link先を確認	Seunghyeok Back, Sangbeom Lee, Kangmin Kim, Joosoon Lee, Sungho Shin, Jemo Maeng, Kyoobin Lee,	(参考訳) 非構造環境における未知の物体の高精度かつ効率的なセグメンテーションは、ロボット操作に不可欠である。 Unknown Object Instance Segmentation (UOIS)は、未知のカテゴリやバックグラウンドのすべてのオブジェクトを識別することを目的としており、様々なロボットタスクにおいて重要な機能となっている。しかし、現在の手法は過剰なセグメンテーションと過度のセグメンテーションに苦しむため、把握のような操作タスクでは失敗する。これらの課題に対処するため,我々は高品質なUOISのための新しい誤り情報処理手法QuBER(Quadruple boundary Error Refinement)を提案する。 QuBERはまず、初期セグメンテーションのインスタンス境界における4倍境界誤差-真正、真負、偽正、偽負の画素-を推定する。その後、エラー誘導融合機構を使用してセグメンテーションを洗練し、細粒度とインスタンスレベルのセグメンテーションエラーを効果的に補正する。 3つの公開ベンチマークの大規模な評価は、QuBERが最先端の手法より優れており、継続的に様々なUOIS技術を改善しつつ、0.1秒未満の高速な推論時間を維持していることを示している。さらに,QuBERは,乱雑な環境下での対象オブジェクトの把握の成功率を向上させることを実証した。コードと補足資料は https://sites.google.com/view/uois-quber で入手できる。 Accurate and efficient segmentation of unknown objects in unstructured environments is essential for robotic manipulation. Unknown Object Instance Segmentation (UOIS), which aims to identify all objects in unknown categories and backgrounds, has become a key capability for various robotic tasks. However, current methods struggle with over-segmentation and under-segmentation, leading to failures in manipulation tasks such as grasping. To address these challenges, we propose QuBER (Quadruple Boundary Error Refinement), a novel error-informed refinement approach for high-quality UOIS. QuBER first estimates quadruple boundary errors (true positive, true negative, false positive, and false negative pixels) at the instance boundaries of the initial segmentation. It then refines the segmentation using an error-guided fusion mechanism, effectively correcting both fine-grained and instance-level segmentation errors. 
Extensive evaluations on three public benchmarks demonstrate that QuBER outperforms state-of-the-art methods and consistently improves various UOIS techniques while maintaining a fast inference time of less than 0.1 seconds. Additionally, we demonstrate that QuBER improves the success rate of grasping target objects in cluttered environments. Code and supplementary materials are available at https://sites.google.com/view/uois-quber.	公開日:2024-09-23 翻訳日:2024-11-09 14:51:04
# ボソニックガウスチャネルの低地・高地容量領域解析 Low-ground/High ground capacity regions analysis for Bosonic Gaussian Channels ( http://arxiv.org/abs/2306.16350v2 ) ライセンス: Link先を確認	Farzad Kianvash, Marco Fanizza, Vittorio Giovannetti,	(参考訳) 本稿では, チャネルの連結から生じる, 単一モード, 位相非感受性ガウスボソニックチャネル間の相互接続の包括的特性について述べる。この特徴付けにより、これらのマップのパラメータ空間において、低地と高地という2つの異なる領域を特定できる。低地領域では、情報容量は指定基準値よりも小さく、高地領域では、確実に大きい。直接的な結果として、これらの写像の量子的およびプライベートな容量について、既知の上界と合成規則を組み合わせた明示的な上界の集合を体系的に概説し、既存の結果を改善する。 We present a comprehensive characterization of the interconnections between single-mode, phase-insensitive Gaussian Bosonic Channels resulting from channel concatenation. This characterization enables us to identify, in the parameter space of these maps, two distinct regions: low-ground and high-ground. In the low-ground region, the information capacities are smaller than a designated reference value, while in the high-ground region, they are provably greater. As a direct consequence, we systematically outline an explicit set of upper bounds for the quantum and private capacity of these maps, which combine known upper bounds and composition rules, improving upon existing results.	公開日:2024-09-26 翻訳日:2024-11-09 14:51:04
# オルタナティブ・テレスコープ・アライメント : 効率的なマルチモーダルアライメント法 Alternative Telescopic Displacement: An Efficient Multimodal Alignment Method ( http://arxiv.org/abs/2306.16950v4 ) ライセンス: Link先を確認	Jiahao Qin, Yitao Xu, Zong Lu, Xiaojun Zhang,	(参考訳) マルチモーダルデータ統合の領域では、機能アライメントが重要な役割を果たす。本稿では,マルチモーダル情報の融合に革命をもたらす機能アライメントに対する革新的なアプローチを提案する。提案手法では,異なるモードをまたいだ特徴表現の遠隔的変位と拡張の新たな反復的プロセスを用いて,共有特徴空間内の一貫性のある統一表現を導出する。この高度な技術は、抽象の最高レベルにおいて複雑なクロスモーダル相互作用を捕捉し、活用する驚くべき能力を示している。その結果,マルチモーダル学習タスクの性能は大幅に向上した。厳密な比較分析により、様々なアプリケーションにまたがる既存のマルチモーダル融合パラダイムに対するアプローチの優位性を確立する。時系列,視覚データ,テキスト情報を含む多面的データセットを用いた総合的な経験的評価は,本手法がこの分野における前例のないベンチマークを達成していることを示す証拠となる。この研究は、マルチモーダル学習における最先端の進歩だけでなく、複雑な分析シナリオにおける異なるデータモダリティ間の相乗効果を探求するための新たな道を開いた。 In the realm of multimodal data integration, feature alignment plays a pivotal role. This paper introduces an innovative approach to feature alignment that revolutionizes the fusion of multimodal information. Our method employs a novel iterative process of telescopic displacement and expansion of feature representations across different modalities, culminating in a coherent unified representation within a shared feature space. This sophisticated technique demonstrates a remarkable ability to capture and leverage complex crossmodal interactions at the highest levels of abstraction. As a result, we observe significant enhancements in the performance of multimodal learning tasks. Through rigorous comparative analysis, we establish the superiority of our approach over existing multimodal fusion paradigms across a diverse array of applications. Comprehensive empirical evaluations conducted on multifaceted datasets encompassing temporal sequences, visual data, and textual information provide compelling evidence that our method achieves unprecedented benchmarks in the field. This work not only advances the state of the art in multimodal learning but also opens new avenues for exploring the synergies between disparate data modalities in complex analytical scenarios.	
公開日:2024-09-25 翻訳日:2024-11-09 14:51:04
# QAOAのためのLXミキサー:部分空間に制限された最適ミキサーと安定化器形式 LX-mixers for QAOA: Optimal mixers restricted to subspaces and the stabilizer formalism ( http://arxiv.org/abs/2306.17083v6 ) ライセンス: Link先を確認	Franz G. Fuchs, Ruben Pariente Bassa,	(参考訳) 与えられた部分空間を保存するミキサーの理解と構築を両立させる新しい形式主義を提示する。この方法は、誤り訂正符号に使用される安定化器形式を接続して利用する。これは、組合せ最適化問題の解法として一般的なメタヒューリスティックである量子近似最適化アルゴリズム(QAOA)が、問題の制約によって大きいが容易に指定できる実行可能部分空間が導かれるような設定に適用される場合に有用である。提案手法は,制御NOTゲートの数で資源効率のよいミキサーを構築する体系的な方法を提供し,よく知られたXとXYミキサーの一般化とGroverミキサーの緩和と理解することができる。すなわち、任意の部分空間の基底が与えられれば、その部分空間を保存する資源効率のよいミキサーを構築できる。得られた数値例では, 従来の結果と比較してCXゲートが劇的に減少していた。我々は、この部分空間を安定化器Sの符号空間に分割し、これらの符号空間に関連する論理回転Xゲートを連続的に適用するものとして理解することができるので、我々のアプローチを論理X-Mixerあるいは論理X QAOA(LX-QAOA)と呼ぶ。全体として、この新しい視点が量子アルゴリズムの発展に関するさらなる洞察に繋がることを願っている。 We present a novel formalism to both understand and construct mixers that preserve a given subspace. The method connects to and utilizes the stabilizer formalism that is used in error correcting codes. This can be useful when the quantum approximate optimization algorithm (QAOA), a popular meta-heuristic for solving combinatorial optimization problems, is applied in a setting where the constraints of the problem lead to a feasible subspace that is large but easy to specify. The proposed method gives a systematic way to construct mixers that are resource efficient in the number of controlled-NOT gates and can be understood as a generalization of the well-known X and XY mixers and a relaxation of the Grover mixer: Given a basis of any subspace, a resource efficient mixer can be constructed that preserves the subspace. The numerical examples provided show a dramatic reduction of CX gates when compared to previous results. We call our approach logical X-Mixer or logical X QAOA (LX-QAOA), since it can be understood as dividing the subspace into code spaces of stabilizers S and consecutively applying logical rotational X gates associated with these code spaces. 
Overall, we hope that this new perspective can lead to further insight into the development of quantum algorithms.	公開日:2024-09-23 翻訳日:2024-11-09 14:51:04
# 拡散モデルによる逆量子化と色移動 Dequantization and Color Transfer with Diffusion Models ( http://arxiv.org/abs/2307.02698v4 ) ライセンス: Link先を確認	Vaibhav Vavilala, Faaris Shaik, David Forsyth,	(参考訳) 自然画像の新規な画像編集を可能にする、画像逆量子化拡散モデルを示す。パッチベースの編集やパレット転送を簡単に抽象化できるため,量子化画像の操作を提案する。特に,カラーパレットが拡散モデルの出力を制御し,解釈しやすくすることを示す。まず,JPEGノイズ低減モデルなど,既存の画像復元手法では不十分であることを確認する。次に、我々のモデルが、ユーザが要求したカラーパレットを尊重する自然な画像を生成できることを実証する。パレット転送のために,重み付き二部マッチングに基づく手法を提案する。そして本モデルが, 極端なパレット転送後であっても, ユーザクエリを尊重して, もっともらしい画像を生成することを示す。本手法は、画像の一部または全部のソーステクスチャをオプションとして条件付けに用いることができる。これにより、入力と異なる輝度で色を生成できない既存の画像カラー化手法において、一般的な問題を克服する。輝度,画像勾配,しきい値処理した勾配など,テクスチャコンディショニングのいくつかの候補とそのトレードオフを評価したところ,テクスチャの維持と色の制御を両立する点では,しきい値処理した勾配が最も優れていた。本手法は,ソーステクスチャを尊重しながら画像のパッチを再着色するという,別の実用的な編集にも拡張できる。我々の手順は、いくつかの質的、定量的な評価によって支えられている。 We demonstrate an image dequantizing diffusion model that enables novel image edits on natural images. We propose operating on quantized images because they offer easy abstraction for patch-based edits and palette transfer. In particular, we show that color palettes can make the output of the diffusion model easier to control and interpret. We first establish that existing image restoration methods, such as JPEG noise reduction models, are not sufficient. We then demonstrate that our model can generate natural images that respect the color palette the user asked for. For palette transfer, we propose a method based on weighted bipartite matching. We then show that our model generates plausible images even after extreme palette transfers, respecting the user query. Our method can optionally condition on the source texture in part or all of the image. In doing so, we overcome a common problem in existing image colorization methods that are unable to produce colors with a different luminance than the input. 
We evaluate several possibilities for texture conditioning and their trade-offs, including luminance, image gradients, and thresholded gradients, the latter of which performed best in maintaining texture and color control simultaneously. Our method can be usefully extended to another practical edit: recoloring patches of an image while respecting the source texture. Our procedure is supported by several qualitative and quantitative evaluations.	公開日:2024-09-21 翻訳日:2024-11-09 14:51:04
# 永続的ホモロジーランク関数を用いた推論の安定性 Stability for Inference with Persistent Homology Rank Functions ( http://arxiv.org/abs/2307.02904v2 ) ライセンス: Link先を確認	Qiquan Wang, Inés García-Redondo, Pierre Faugère, Gregory Henselman-Petrusek, Anthea Monod,	(参考訳) 永続ホモロジーバーコードとダイアグラムは、点雲、ネットワーク、関数など、幅広い複雑なデータ構造の「形」を捉えたトポロジ的データ解析の基盤である。しかし、その複雑な幾何学的構造のため、統計的な設定での使用は困難である。本稿では,統計と機械学習のツールとして,バーコードと永続化図に数学的に等価な永続的ホモロジーランク関数を再検討する。ランク関数は、関数であるため、関数の形をしたデータに適した統計学の一領域である関数データ解析(FDA)の統計理論を直接適用できる。しかし、実用上バーコードと比べたときの重要な課題は安定性の欠如であり、安定性はデータの忠実な表現として、ひいては有効な要約統計量としての使用を正当化する上で極めて重要な性質である。本稿では,FDA 統合のための適切な距離の下で,永続的ホモロジーランク関数に対する2つの安定性結果を導出することにより,このギャップを埋める。次に、関数データに対する推測統計および機械学習におけるランク関数の性能を、単パラメータおよび多パラメータの永続的ホモロジーの両方において、実データアプリケーション上で研究する。ランク関数によって捕捉される永続的ホモロジーの使用は、既存の非永続的アプローチよりも明らかな改善をもたらす。 Persistent homology barcodes and diagrams are a cornerstone of topological data analysis that capture the "shape" of a wide range of complex data structures, such as point clouds, networks, and functions. However, their use in statistical settings is challenging due to their complex geometric structure. In this paper, we revisit the persistent homology rank function, which is mathematically equivalent to a barcode and persistence diagram, as a tool for statistics and machine learning. Rank functions, being functions, enable the direct application of the statistical theory of functional data analysis (FDA), a domain of statistics adapted for data in the form of functions. A key challenge they present over barcodes in practice, however, is their lack of stability, a property that is crucial to validate their use as a faithful representation of the data and therefore a viable summary statistic. In this paper, we fill this gap by deriving two stability results for persistent homology rank functions under a suitable metric for FDA integration. 
We then study the performance of rank functions in functional inferential statistics and machine learning on real data applications, in both single and multiparameter persistent homology. We find that the use of persistent homology captured by rank functions offers a clear improvement over existing non-persistence-based approaches.	公開日:2024-09-22 翻訳日:2024-11-09 14:51:04
# 双対クープマン回路からの多体カオスの可解モデル Solvable models of many-body chaos from dual-Koopman circuits ( http://arxiv.org/abs/2307.04950v2 ) ライセンス: Link先を確認	Arul Lakshminarayan,	(参考訳) 双対ユニタリ回路は、相関関数や状態の時間発展について厳密に解ける多体量子カオスのモデルとして活発に研究されている。ここでは、それらの古典的対応を双対正準変換と関連する双対クープマン作用素として定義する。それらの量子対応物と同様に、相関は光円錐上を除いて至る所で消え、光円錐上では単純な縮小写像によって支配される速度で減衰する。そのような双対正準変換の大規模なクラスを提供したうえで、結合された標準写像の例を詳細に研究し、可積分な場合から任意に離れていても、熱力学的極限において系が混合的であることを解析的に示す。また、光円錐上を含む至る所で相関が消滅する「完全」クープマン作用素を定義し、エルゴード階層の頂点にあるベルヌーイ系と見なしうる猫写像格子の例を示す。 Dual-unitary circuits are being vigorously studied as models of many-body quantum chaos that can be solved exactly for correlation functions and time evolution of states. Here we define their classical counterparts as dual-canonical transformations and associated dual-Koopman operators. Like their quantum counterparts, the correlations vanish everywhere except on the light cone, on which they decay with rates governed by a simple contractive map. Providing a large class of such dual-canonical transformations, we study in detail the example of a coupled standard map and show analytically that, arbitrarily far from the integrable case, the system is mixing in the thermodynamic limit. We also define "perfect" Koopman operators that lead to the correlation vanishing everywhere including on the light cone and provide an example of a cat-map lattice which would qualify to be a Bernoulli system at the apex of the ergodic hierarchy.	公開日:2024-09-24 翻訳日:2024-11-09 14:51:04
# 資源制約を考慮した分散パラメータ推定における協調について On Collaboration in Distributed Parameter Estimation with Resource Constraints ( http://arxiv.org/abs/2307.06442v2 ) ライセンス: Link先を確認	Yu-Zhen Janice Chen, Daniel S. Menasché, Don Towsley,	(参考訳) センサネットワーク、IoTシステム、分散コンピューティングにおける効果的なリソース割り当ては、環境監視、監視、スマートインフラストラクチャといったアプリケーションに不可欠である。センサやエージェントはパラメータ推定の精度を最大化するためにリソース割り当てを最適化する必要がある。本研究では,多変量ガウス分布の異なる変数からそれぞれサンプリングし,異なる推定対象を持つセンサ群やエージェント群について考察する。センサやエージェントのデータ収集や協調政策の設計問題をフィッシャー情報最大化(あるいはクレーマー・ラオ境界最小化)問題として定式化する。この定式化は、局所的な単変量サンプルの収集と多変量サンプルの生成の協調の間で、エネルギー利用の新たなトレードオフを捉えている。変数間の相関関係の知識が得られれば,(1)最適なデータ収集ポリシーが協調サンプリングのための情報伝達に資源を投入する,(2)サンプル間の相関関係の知識が推定効率を高めることができない,という2つの事例を解析的に同定する。相関関係の知識は利用できないが, 協調が有益である場合, 逐次分散パラメータ推定問題において, 最適なデータ収集と協調ポリシーを学習するために, マルチアームバンディットアルゴリズムを適用した新しいアプローチを提案する。本稿では,提案アルゴリズムであるDOUBLE-F, DOUBLE-Z, UCB-F, UCB-Zの有効性について述べる。 Effective resource allocation in sensor networks, IoT systems, and distributed computing is essential for applications such as environmental monitoring, surveillance, and smart infrastructure. Sensors or agents must optimize their resource allocation to maximize the accuracy of parameter estimation. In this work, we consider a group of sensors or agents, each sampling from a different variable of a multivariate Gaussian distribution and having a different estimation objective. We formulate a sensor or agent's data collection and collaboration policy design problem as a Fisher information maximization (or Cramer-Rao bound minimization) problem. This formulation captures a novel trade-off in energy use, between locally collecting univariate samples and collaborating to produce multivariate samples. When knowledge of the correlation between variables is available, we analytically identify two cases: (1) where the optimal data collection policy entails investing resources to transfer information for collaborative sampling, and (2) where knowledge of the correlation between samples cannot enhance estimation efficiency. 
When knowledge of certain correlations is unavailable, but collaboration remains potentially beneficial, we propose novel approaches that apply multi-armed bandit algorithms to learn the optimal data collection and collaboration policy in our sequential distributed parameter estimation problem. We illustrate the effectiveness of the proposed algorithms, DOUBLE-F, DOUBLE-Z, UCB-F, UCB-Z, through simulation.	公開日:2024-09-24 翻訳日:2024-11-09 14:51:04
# 風場の大規模空間補間のための二変量DeepKriging Bivariate DeepKriging for Large-scale Spatial Interpolation of Wind Fields ( http://arxiv.org/abs/2307.08038v2 ) ライセンス: Link先を確認	Pratik Nag, Ying Sun, Brian J Reich,	(参考訳) 高空間分解能風速データは、気候、海洋学、気象学研究における幅広い応用に不可欠である。 2次元の速度を持つ二変量風の大規模空間補間またはダウンスケーリングは、風データが高空間変動と不均一性を有する非ガウス的である傾向があるため、難しい課題である。空間統計学において、コクリギングは二変量空間場を予測するのに一般的に用いられる。しかし、コクリギング予測子はガウス過程を除いて最適ではない。さらに、コクリギングは大規模データセットでは計算コストが非常に高い。本稿では,二変量空間データ予測のための空間ラジアル基底関数によって構築された埋め込み層を備えた空間依存型ディープニューラルネットワーク(DNN)である二変量DeepKriging法を提案する。次に,ブートストラップとアンサンブルDNNに基づく分布によらない不確実性定量化手法を開発した。提案手法は,線形コリージョナリゼーションモデルや柔軟な二変量Matérn共分散などの一般的な共分散関数を用いた従来のコクリギング予測子よりも優れている。提案したDNNモデルの計算効率とスケーラビリティを,従来の手法に比べて平均20倍高速な計算で実証する。二変量DeepKriging法を中東の506,771箇所の風速データに適用した。提案手法の予測性能はコクリギング予測子よりも優れており,計算時間を劇的に短縮する。 High spatial resolution wind data are essential for a wide range of applications in climate, oceanographic and meteorological studies. Large-scale spatial interpolation or downscaling of bivariate wind fields having velocity in two dimensions is a challenging task because wind data tend to be non-Gaussian with high spatial variability and heterogeneity. In spatial statistics, cokriging is commonly used for predicting bivariate spatial fields. However, the cokriging predictor is not optimal except for Gaussian processes. Additionally, cokriging is computationally prohibitive for large datasets. In this paper, we propose a method, called bivariate DeepKriging, which is a spatially dependent deep neural network (DNN) with an embedding layer constructed by spatial radial basis functions for bivariate spatial data prediction. We then develop a distribution-free uncertainty quantification method based on bootstrap and ensemble DNN. Our proposed approach outperforms the traditional cokriging predictor with commonly used covariance functions, such as the linear model of co-regionalization and flexible bivariate Matérn covariance. 
We demonstrate the computational efficiency and scalability of the proposed DNN model, with computations that are, on average, 20 times faster than those of conventional techniques. We apply the bivariate DeepKriging method to the wind data over the Middle East region at 506,771 locations. The prediction performance of the proposed method is superior to that of the cokriging predictors, and the method dramatically reduces computation time.	公開日:2024-09-26 翻訳日:2024-11-09 14:51:04
# ポーラメカニクス:三重強結合領域における光子、マグノン、フォノン Polaromechanics: photons, magnons and phonons in the triple strong-coupling regime ( http://arxiv.org/abs/2307.11328v3 ) ライセンス: Link先を確認	Rui-Chang Shen, Jie Li, Yi-Ming Sun, Wei-Jiang Wu, Xuan Zuo, Yi-Pu Wang, Shi-Yao Zhu, J. Q. You,	(参考訳) ハイブリッド量子システムの構築は、多機能量子技術、量子情報処理、ハイブリッド量子ネットワークを実現するための重要なステップである。機能的なハイブリッド量子系は、その成分間の強い結合を必要とする。しかし、異なる物理系間のカップリングは通常非常に弱い。ハイブリッドシステムにおける強い結合の実験的実現は、特に複数のコンポーネントを持ち、コンポーネントが異なる性質を持つ場合、長年にわたる課題である。ここでは、強結合された強磁性マグノンとマイクロ波光子によって形成されるポラリトンがフォノンとさらに強く結合する、新しいポーラメカニカルハイブリッドシステムにおける三重強結合の実現を実証する。対応するポーラメカニカルな正規モード分裂が観察される。 $9.4\times10^3$の高いポーラメカニカル協同性は、コヒーレント完全吸収を利用してポラリトンの崩壊率を著しく減少させることによって達成される。システムを低い浴温度に置けば、1よりはるかに大きい量子協同性を達成でき、様々な量子応用が可能となる。この結果は、光子、マグノン、フォノンのコヒーレントな量子制御への道を開くものであり、マグノンをベースとした機能的なハイブリッド量子システムを構築するための重要なステップである。 Building hybrid quantum systems is a crucial step for realizing multifunctional quantum technologies, quantum information processing, and hybrid quantum networks. A functional hybrid quantum system requires strong coupling among its components. However, couplings between distinct physical systems are typically very weak. Experimental realization of strong coupling in a hybrid system remains a long-standing challenge, especially when it has multiple components and the components are of different nature. Here we demonstrate the realization of triple strong coupling in a novel polaromechanical hybrid system, where polaritons, formed by strongly coupled ferromagnetic magnons and microwave photons, are further strongly coupled to phonons. The corresponding polaromechanical normal-mode splitting is observed. A high polaromechanical cooperativity of $9.4\times10^3$ is achieved by significantly reducing the polariton decay rate via exploiting coherent perfect absorption. A quantum cooperativity much greater than unity is achievable by placing the system at low bath temperatures, which would enable various quantum applications. 
Our results pave the way towards coherent quantum control of photons, magnons and phonons, and are a crucial step for building functional hybrid quantum systems based on magnons.	公開日:2024-09-27 翻訳日:2024-11-09 14:51:04
# 複素数を持つ論理ゲートについて On Logic Gates with Complex Numbers ( http://arxiv.org/abs/2307.12905v6 ) ライセンス: Link先を確認	M. W. AlMasri,	(参考訳) 論理ゲートは複素微分作用素の言葉で書くことができ、入力と出力は複数の変数を持つ正則函数である。複素数の極表現を用いて、系の振動挙動と論理ゲートの間の即時接続に到達する。様々な計算システムにおけるこの形式主義の普遍性について論じる。 Logic gates can be written in terms of complex differential operators, where the inputs and outputs are holomorphic functions with several variables. Using the polar representation of complex numbers, we arrive at an immediate connection between the oscillatory behavior of the system and logic gates. We discuss the universality of this formalism in a variety of computing systems.	公開日:2024-10-10 翻訳日:2024-11-09 14:51:04
# 個人差分重み付き経験的リスク最小化手法とその出力重み付き学習への応用 A Differentially Private Weighted Empirical Risk Minimization Procedure and its Application to Outcome Weighted Learning ( http://arxiv.org/abs/2307.13127v2 ) ライセンス: Link先を確認	Spencer Giddens, Yiwang Zhou, Kevin R. Krull, Tara M. Brinkman, Peter X. K. Song, Fang Liu,	(参考訳) 個人情報を含むデータを用いて、経験的リスク最小化(ERM)の枠組みで予測モデルを構築するのが一般的である。これらのモデルは予測には非常に正確であるが、機密性の高いデータに基づいてトレーニングされたこれらのモデルの結果を共有することは、プライバシ攻撃の影響を受けやすい。差分プライバシー(DP)は、機密データから情報を公開する際に生じるプライバシー損失に数学的に証明可能な境界を提供することによって、そのようなデータプライバシー問題に対処するための魅力的なフレームワークである。これまでの作業は主に、未加重ERMにDPを適用することに集中してきた。重み付きERM (wERM) は, 目的関数に対する個々の貢献を様々な重みに割り当てることができる重要な一般化である。一般のwERMに対する最初の微分プライベートアルゴリズムを提案し、理論DPを保証する。既存のDP-ERMプロシージャをwERMに拡張することで、一般的な結果重み付き学習(OWL)を含む個別の処理ルールに対するプライバシー保護学習手法を導出する道が形成される。シミュレーションおよび実際の臨床試験において,OWLに適用したDP-wERMフレームワークの性能評価を行った。実験結果はすべて、十分な堅牢なモデル性能を維持しつつ、DP保証付きwERMによるOWLモデルのトレーニングが可能であることを示し、センシティブなデータを含む現実のシナリオにおいて、提案したプライバシ保存OWLプロシージャの実装の実用性を示す強力な証拠を提供する。 It is common practice to use data containing personal information to build predictive models in the framework of empirical risk minimization (ERM). While these models can be highly accurate in prediction, sharing the results from these models trained on sensitive data may be susceptible to privacy attacks. Differential privacy (DP) is an appealing framework for addressing such data privacy issues by providing mathematically provable bounds on the privacy loss incurred when releasing information from sensitive data. Previous work has primarily concentrated on applying DP to unweighted ERM. We consider weighted ERM (wERM), an important generalization, where each individual's contribution to the objective function can be assigned varying weights. We propose the first differentially private algorithm for general wERM, with theoretical DP guarantees. 
Extending the existing DP-ERM procedures to wERM creates a pathway for deriving privacy-preserving learning methods for individualized treatment rules, including the popular outcome weighted learning (OWL). We evaluate the performance of the DP-wERM framework applied to OWL in both simulation studies and in a real clinical trial. All empirical results demonstrate the feasibility of training OWL models via wERM with DP guarantees while maintaining sufficiently robust model performance, providing strong evidence for the practicality of implementing the proposed privacy-preserving OWL procedure in real-world scenarios involving sensitive data.	公開日:2024-09-27 翻訳日:2024-11-09 14:51:04
# 局所アドレス性に制限のある中性原子デバイスにおける回路分解とスケジューリング Circuit decompositions and scheduling for neutral atom devices with limited local addressability ( http://arxiv.org/abs/2307.14996v2 ) ライセンス: Link先を確認	Natalia Nottingham, Michael A. Perlin, Dhirpal Shah, Ryan White, Hannes Bernien, Frederic T. Chong, Jonathan M. Baker,	(参考訳) 中性原子ハードウェア技術の進歩は続いているが、中性原子量子コンピュータの課題を克服するために設計されたシステムレベルのソフトウェアでは、まだ開発が限られている。特に、現在の中性原子アーキテクチャのほとんどは、ブロッホ球のxy平面の軸付近の1量子ビット回転の局所的なアドレッシングをネイティブにサポートしていない。代わりに、これらは全てのキュービットに同時に適用されるグローバルビームを介して実行される。従来の中性原子実験では、操作の短いシーケンスをこのネイティブゲートセットに変換する単純な合成法を使用していたが、これらの方法はシステムレベルのフレームワークに組み込むことも、非現実的なシリアライゼーションの量を課すことなく、回路全体に適用することもできない。十分なコンパイラ最適化がなければ、グローバルゲートを含む分解は回路深さ、ゲート数、エラーの蓄積を大幅に増加させる。この問題に対処する以前のコンパイラ作業はなく、この問題を解決するために既存のコンパイラを適用するのは簡単ではない。本稿では,任意のゲートセットからグローバルゲートを含むリアルな中性原子ネイティブゲートセットに入力回路を変換する最適化コンパイラパイプラインを提案する。最終回路のグローバルゲート数と全グローバルローテーション量を最小限に抑える分解とスケジューリングに焦点をあてる。示すように、これらのコストは、他のゲートタイプによるコストと比較して、回路の持続時間と全体的な誤差に最も寄与する。コンパイラパイプラインの最適化されていないバージョンと比較して、グローバルゲートコストの最小化は、回路長の最大4.77倍のスピードアップをもたらす。従来の作業と比べ、最大53.8倍のスピードアップを実現しています。大型回路では,回路の忠実度が若干向上している。 Despite major ongoing advancements in neutral atom hardware technology, there remains limited work in systems-level software tailored to overcoming the challenges of neutral atom quantum computers. In particular, most current neutral atom architectures do not natively support local addressing of single-qubit rotations about an axis in the xy-plane of the Bloch sphere. Instead, these are executed via global beams applied simultaneously to all qubits. While previous neutral atom experimental work has used straightforward synthesis methods to convert short sequences of operations into this native gate set, these methods cannot be incorporated into a systems-level framework nor applied to entire circuits without imposing impractical amounts of serialization. 
Without sufficient compiler optimizations, decompositions involving global gates will significantly increase circuit depth, gate count, and accumulation of errors. No prior compiler work has addressed this, and adapting existing compilers to solve this problem is nontrivial. In this paper, we present an optimized compiler pipeline that translates an input circuit from an arbitrary gate set into a realistic neutral atom native gate set containing global gates. We focus on decomposition and scheduling passes that minimize the final circuit's global gate count and total global rotation amount. As we show, these costs contribute the most to the circuit's duration and overall error, relative to costs incurred by other gate types. Compared to the unoptimized version of our compiler pipeline, minimizing global gate costs gives up to 4.77x speedup in circuit duration. Compared to the closest prior existing work, we achieve up to 53.8x speedup. For large circuits, we observe a few orders of magnitude improvement in circuit fidelities.	公開日:2024-09-23 翻訳日:2024-11-09 14:51:04
# RoboDepth Challenge:ロバスト深さ推定に向けた手法と進歩 The RoboDepth Challenge: Methods and Advancements Towards Robust Depth Estimation ( http://arxiv.org/abs/2307.15061v2 ) ライセンス: Link先を確認	Lingdong Kong, Yaru Niu, Shaoyuan Xie, Hanjiang Hu, Lai Xing Ng, Benoit R. Cottereau, Liangjun Zhang, Hesheng Wang, Wei Tsang Ooi, Ruijie Zhu, Ziyang Song, Li Liu, Tianzhu Zhang, Jun Yu, Mohan Jing, Pengwei Li, Xiaohua Qi, Cheng Jin, Yingfeng Chen, Jie Hou, Jie Zhang, Zhen Kan, Qiang Ling, Liang Peng, Minglei Li, Di Xu, Changpeng Yang, Yuanqi Yao, Gang Wu, Jian Kuai, Xianming Liu, Junjun Jiang, Jiamian Huang, Baojun Li, Jiale Chen, Shuang Zhang, Sun Ao, Zhenyu Li, Runze Chen, Haiyong Luo, Fang Zhao, Jingze Yu,	(参考訳) 悪天候, センサ故障, ノイズ混入など, アウト・オブ・ディストリビューション(OoD)のシナリオ下での正確な深度推定は, 安全クリティカルな応用に望ましい。しかし、既存の深度推定システムは、必然的に現実世界のデータ劣化や摂動に悩まされ、そのような場合の信頼性の高い深度予測に苦慮している。本稿では,頑健なOoD深度推定を促進することを目的とした学術コンペであるRoboDepth Challengeの優勝ソリューションを要約する。このチャレンジは、新たに確立されたKITTI-CとNYUDepth2-Cベンチマークに基づいて開発された。我々は2つのスタンドアローントラックをホストし、それぞれ、頑健な自己教師ありと頑健な完全教師ありの深度推定に重点を置いていた。 200人を超える参加者のうち、9つの独特で最高のソリューションが登場し、空間領域と周波数領域のデータ拡張、マスク付き画像モデリング、画像復元と超解像、敵対的学習、拡散に基づくノイズ抑圧、視覚言語による事前学習、学習型モデルアンサンブル、階層的特徴強化など、新しい設計がなされている。各設計の背景にある理論的根拠をよりよく理解するために、総合的な実験分析と洞察に富んだ観察を示す。このチャレンジが、堅牢で信頼性の高い深度推定などに関する将来の研究の確固たる基盤となることを願っている。データセット、競争ツールキット、ワークショップ記録、優勝チームのソースコードは、チャレンジウェブサイトで公開されている。 Accurate depth estimation under out-of-distribution (OoD) scenarios, such as adverse weather conditions, sensor failure, and noise contamination, is desirable for safety-critical applications. Existing depth estimation systems, however, inevitably suffer from real-world corruptions and perturbations and struggle to provide reliable depth predictions in such cases. In this paper, we summarize the winning solutions from the RoboDepth Challenge -- an academic competition designed to facilitate and advance robust OoD depth estimation. This challenge was developed based on the newly established KITTI-C and NYUDepth2-C benchmarks. 
We hosted two stand-alone tracks, with an emphasis on robust self-supervised and robust fully-supervised depth estimation, respectively. Out of more than two hundred participants, nine unique and top-performing solutions have appeared, with novel designs ranging from the following aspects: spatial- and frequency-domain augmentations, masked image modeling, image restoration and super-resolution, adversarial training, diffusion-based noise suppression, vision-language pre-training, learned model ensembling, and hierarchical feature enhancement. Extensive experimental analyses along with insightful observations are drawn to better understand the rationale behind each design. We hope this challenge could lay a solid foundation for future research on robust and reliable depth estimation and beyond. The datasets, competition toolkit, workshop recordings, and source code from the winning teams are publicly available on the challenge website.	公開日:2024-09-24 翻訳日:2024-11-09 14:40:04
# 半無限導波路と結合した原子に基づく量子コヒーレント及び測定フィードバック制御 Quantum coherent and measurement feedback control based on atoms coupled with a semi-infinite waveguide ( http://arxiv.org/abs/2307.16876v4 ) ライセンス: Link先を確認	Haijin Ding, Nina H. Amini, Guofeng Zhang, John E. Gough,	(参考訳) 本稿では,複数の2レベル原子を結合した半無限導波路に基づく原子・フォトニック系の所望の状態を生成するために,量子フィードバック制御が適用可能であることを示す。このセットアップでは、初期励起原子が導波路に1つの光子を放出し、終端ミラーや他の原子によって反射され、原子と光子のコヒーレント相互作用を介して異なるフィードバックループを確立することができる。導波路量子電磁力学(導波路QED)系に高々2つの励起が存在する場合、量子状態の進化はランダムグラフ理論を用いて解釈できる。このプロセスは環境の影響を受けるが,計測に基づくフィードバック制御やコヒーレントドライブによって環境誘起のダイナミクスを排除できることを明らかにする。したがって、オープン系原子-導波路相互作用において、測定に基づくフィードバックは最終的な定常量子状態を変調することができ、同時に、測定プロセスにおけるホモダイン検出ノイズは振動を誘発し、この振動はコヒーレントなフィードバック設計によって処理される。 In this paper, we show that quantum feedback control may be applied to generate desired states for atomic and photonic systems based on a semi-infinite waveguide coupled with multiple two-level atoms. In this set-up, an initially excited atom can emit one photon into the waveguide, which can be reflected by the terminal mirror or other atoms to establish different feedback loops via the coherent interactions between the atom and photon. When there are at most two excitations in the waveguide quantum electrodynamics (waveguide QED) system, the evolution of quantum states can be interpreted using random graph theory. Although this process is influenced by the environment, we clarify that the environment-induced dynamics can be eliminated by measurement-based feedback control or coherent drives. Thus, in open-system atom-waveguide interactions, measurement-based feedback can modulate the final steady quantum state, while simultaneously, the homodyne detection noise in the measurement process can induce oscillations, which are treated by coherent feedback designs.	公開日:2024-09-24 翻訳日:2024-11-09 14:40:04
# 複数の固有値の位相推定のためのチャネルベースフレームワーク Channel-based framework for phase estimation of multiple eigenvalues ( http://arxiv.org/abs/2308.02307v2 ) ライセンス: Link先を確認	Yuan-De Jin, Shi-Yu Zhang, Wen-Long Ma,	(参考訳) ターゲット量子系上のユニタリ演算子の固有値の量子位相推定(QPE)は、様々な量子アルゴリズムにおいて重要なサブルーチンである。従来のQPEは、多くのアンシラ量子ビットと量子フーリエ変換を実行する能力を必要とするため、実装に費用がかかることが多い。反復QPEの最近の進歩は、単一アンシラと古典的な後処理を繰り返し使用することにより、実装コストを削減している。しかし、従来型と反復型の両方のスキームでは、ユニタリ演算子の固有状態におけるターゲットシステムの準備が要求されることが多く、初期状態の準備を必要とせずに複数の固有値のQPEをどのように達成するかは明らかでない。ここでは、反復QPEのための逐次量子チャネルに基づく理論的枠組みを開発することにより、この問題を明らかにする。複数固有値のQPEを任意の初期目標系状態に対して効率よく実現でき, そのためには長いコヒーレンス時間を持つ目標系に対する反復QPEの測定バックアクションを積極的に活用すればよいことを見出した。具体的には、アンシラ量子ビットの逐次ラムゼー干渉計測(RIM)に基づく2つの反復QPEスキームについて検討する。 (a) 固有値を推定する際の標準量子極限を達成するために反復RIMを実行する反復スキーム (b) ハイゼンベルク限界に達するための事前測定結果に基づいて各RIMのパラメータを調整する適応型スキーム。どちらのスキームにおいても、連続的なアンシラ測定はターゲットシステム上で逐次的な量子チャネルを生成し、それを推定されたユニタリ演算子の固有状態に徐々にステアリングする一方、アンシラの測定統計は適切な後処理でその固有値に関する埋め込み情報を明らかにすることができる。本研究では, 中心スピンモデルのシミュレーションによって解析を例示し, 両スキームの性能と耐雑音性を評価する。 Quantum phase estimation (QPE) of the eigenvalues of a unitary operator on a target quantum system is a crucial subroutine in various quantum algorithms. Conventional QPE is often expensive to implement as it requires a large number of ancilla qubits and the ability to perform quantum Fourier transform. Recent developments in iterative QPE reduce the implementation cost by repetitive uses of a single ancilla and classical post-processing. However, both conventional and iterative schemes often require preparation of the target system in an eigenstate of the unitary operator, while it remains unclear how to achieve QPE of multiple eigenvalues without initial state preparation. Here we clarify this issue by developing a theoretical framework based on sequential quantum channels for iterative QPE. We find that QPE of multiple eigenvalues can be efficiently realized for an arbitrary initial state of the target system by actively utilizing the measurement backaction of iterative QPE on the target system with a long coherence time. 
Specifically, we investigate two iterative QPE schemes based on sequential Ramsey interferometry measurements (RIMs) of an ancilla qubit: (a) the repetitive scheme, which conducts repetitive RIMs to achieve the standard quantum limit in estimating the eigenvalues; (b) the adaptive scheme, which adjusts the parameters of each RIM based on prior measurement outcomes to attain the Heisenberg limit. In both schemes, sequential ancilla measurements generate sequential quantum channels on the target system, gradually steering it to the eigenstates of the estimated unitary operator, while the measurement statistics of the ancilla can reveal the embedded information about its eigenvalues with proper post-processing. We demonstrate the analysis by simulating a central spin model, and evaluate the performance and noise resilience of both schemes.	公開日:2024-09-27 翻訳日:2024-11-09 14:40:04
# TempFuser: 長短期時間融合トランスフォーマーを用いたアジャイル、戦術的、アクロバティックな飛行機動の学習 TempFuser: Learning Agile, Tactical, and Acrobatic Flight Maneuvers Using a Long Short-Term Temporal Fusion Transformer ( http://arxiv.org/abs/2308.03257v4 ) ライセンス: Link先を確認	Hyunki Seong, David Hyunchul Shim,	(参考訳) ドッグファイティングは、戦略的操作とアジャイル航空機の空気力学の両方を包括的に理解する必要がある航空アプリケーションにおいて難しいシナリオである。航空エージェントは、長期的視点から戦闘機の戦術的に進化する操縦を理解するだけでなく、短期的な視点から急速に変化する航空機の空気力学に反応することも必要である。本稿では, 複雑なドッグファイト問題におけるアジャイル, 戦術的, アクロバティックな飛行操作を学習できる, 新しい長短期時間融合トランスフォーマーアーキテクチャである TempFuser を紹介する。当社のアプローチでは、2つの異なる時間的遷移の埋め込みをトランスフォーマーベースのネットワークに統合し、航空エージェントの長期的戦術と短期的機敏性の両方を包括的に捉える。これらの視点を取り入れることで、当社のポリシネットワークは、長期にわたって支配的な位置を確保し、効果的にアジャイルな相手を上回る、エンドツーエンドのフライトコマンドを生成します。高忠実度飛行シミュレーターで訓練した後、我々のモデルは戦略的な操作をうまく学習し、様々な種類の敵機に対してベースライン方策モデルより優れた性能を発揮する。特に,本モデルでは,先行知識を必要とせず,優れた仕様の敵に面しても,人間のようなアクロバティックな操作が可能である。さらに,超音速・低高度の課題において,強靭な追尾性能を示す。デモビデオはhttps://sites.google.com/view/tempfuserで公開されている。 Dogfighting is a challenging scenario in aerial applications that requires a comprehensive understanding of both strategic maneuvers and the aerodynamics of agile aircraft. The aerial agent needs to not only understand tactically evolving maneuvers of fighter jets from a long-term perspective but also react to rapidly changing aerodynamics of aircraft from a short-term viewpoint. In this paper, we introduce TempFuser, a novel long short-term temporal fusion transformer architecture that can learn agile, tactical, and acrobatic flight maneuvers in complex dogfight problems. Our approach integrates two distinct temporal transition embeddings into a transformer-based network to comprehensively capture both the long-term tactics and short-term agility of aerial agents. By incorporating these perspectives, our policy network generates end-to-end flight commands that secure dominant positions over the long term and effectively outmaneuver agile opponents. 
After training in a high-fidelity flight simulator, our model successfully learns to execute strategic maneuvers, outperforming baseline policy models against various types of opponent aircraft. Notably, our model exhibits human-like acrobatic maneuvers even when facing adversaries with superior specifications, all without relying on prior knowledge. Moreover, it demonstrates robust pursuit performance in challenging supersonic and low-altitude situations. Demo videos are available at https://sites.google.com/view/tempfuser.	公開日:2024-09-25 翻訳日:2024-11-09 14:40:04
# 量子コンピュータのためのファジィゲージ理論 Fuzzy gauge theory for quantum computers ( http://arxiv.org/abs/2308.05253v4 ) ライセンス: Link先を確認	Andrei Alexandru, Paulo F. Bedaque, Andrea Carosso, Michael J. Cervia, Edison M. Murairi, Andy Sheng,	(参考訳) 連続ゲージ理論は、そのボゾン次数により、無限次元局所ヒルベルト空間を持つ。量子ビットベースのハードウェア上でこれらの自由度を符号化するには、有限個の自由度しか使わずに理論の振舞いを近似するある種の「量子化」スキームが必要である。ファジィゲージ理論 (fuzzy gauge theory) と呼ばれるゲージ理論に対する新しい量子化戦略を提案し、ファジィ$\sigma$-モデルの成功に基づく。ファジィゲージ理論は正規ゲージ理論と同じ普遍性クラスに属し、その場合、通常の空間連続極限以外のいかなる極限も必要としない。さらに,これらのモデルが量子シミュレーションにおいて比較的資源効率が高いことを示す。 Continuous gauge theories, because of their bosonic degrees of freedom, have an infinite-dimensional local Hilbert space. Encoding these degrees of freedom on qubit-based hardware demands some sort of ``qubitization'' scheme, where one approximates the behavior of a theory while using only finitely many degrees of freedom. We propose a novel qubitization strategy for gauge theories, called ``fuzzy gauge theory,'' building on the success of the fuzzy $\sigma$-model in earlier work. We provide arguments that the fuzzy gauge theory lies in the same universality class as regular gauge theory, in which case its use would obviate the need of any further limit besides the usual spatial continuum limit. Furthermore, we demonstrate that these models are relatively resource-efficient for quantum simulations.	公開日:2024-09-24 翻訳日:2024-11-09 14:40:04
# CyberForce: マルウェア除去のためのフェデレーション強化学習フレームワーク CyberForce: A Federated Reinforcement Learning Framework for Malware Mitigation ( http://arxiv.org/abs/2308.05978v3 ) ライセンス: Link先を確認	Chao Feng, Alberto Huertas Celdran, Pedro Miguel Sanchez Sanchez, Jan Kreischer, Jan von der Assen, Gerome Bovet, Gregorio Martinez Perez, Burkhard Stiller,	(参考訳) 近年の研究では、強化学習(RL)と移動目標防衛(MTD)の統合により、IoT(Internet-of-Things)デバイスにおけるサイバーセキュリティが向上することが示されている。それでも、既存の作業の実践性は、RLにおける集中型データ処理に関連するデータプライバシの懸念や、不均一なゼロデイ攻撃の増加に対して有効な適切なMTD技術を学ぶのに必要な不満足な時間によって妨げられている。この研究は、フェデレーションと強化学習(FRL)を組み合わせたフレームワークであるCyberForceを紹介し、ゼロデイ攻撃を緩和するための適切なMTDテクニックを共同でプライベートに学習する。 CyberForceはデバイスフィンガープリントと異常検出を統合して、FRLベースのエージェントによって選択されたMTDメカニズムを報酬または罰する。このフレームワークは、異種マルウェアのサンプルに影響された実際のIoTプラットフォームの10の物理デバイスで構成されたシナリオでデプロイされ、評価されている。実験のプールは、CyberForceが既存のRLベースの集中型アプローチよりも高速に攻撃を緩和するMTD技術を学ぶことを示した。さらに、様々なデバイスが異なる攻撃にさらされると、CyberForceは知識伝達の恩恵を受け、性能が向上し、最近の研究と比べて学習時間が短縮される。最後に、エージェント学習プロセスで使用される異なる集約アルゴリズムは、CyberForceに悪意のある攻撃に対する顕著な堅牢性を提供する。 Recent research has shown that the integration of Reinforcement Learning (RL) with Moving Target Defense (MTD) can enhance cybersecurity in Internet-of-Things (IoT) devices. Nevertheless, the practicality of existing work is hindered by data privacy concerns associated with centralized data processing in RL, and the unsatisfactory time needed to learn right MTD techniques that are effective against a rising number of heterogeneous zero-day attacks. Thus, this work presents CyberForce, a framework that combines Federated and Reinforcement Learning (FRL) to collaboratively and privately learn suitable MTD techniques for mitigating zero-day attacks. CyberForce integrates device fingerprinting and anomaly detection to reward or penalize MTD mechanisms chosen by an FRL-based agent. The framework has been deployed and evaluated in a scenario consisting of ten physical devices of a real IoT platform affected by heterogeneous malware samples. 
A pool of experiments has demonstrated that CyberForce learns the MTD technique mitigating each attack faster than existing RL-based centralized approaches. In addition, when various devices are exposed to different attacks, CyberForce benefits from knowledge transfer, leading to enhanced performance and reduced learning time in comparison to recent works. Finally, different aggregation algorithms used during the agent learning process provide CyberForce with notable robustness to malicious attacks.	公開日:2024-09-30 翻訳日:2024-11-09 14:40:04
# BehaVR:VRセンサデータに基づくユーザ識別 BehaVR: User Identification Based on VR Sensor Data ( http://arxiv.org/abs/2308.07304v2 ) ライセンス: Link先を確認	Ismat Jarin, Yu Duan, Rahmadi Trimananda, Hao Cui, Salma Elmalaki, Athina Markopoulou,	(参考訳) 仮想現実(VR)プラットフォームは幅広いアプリケーションを可能にする一方で、特有のプライバシーリスクをもたらす。特にVRデバイスには、個人的かつ機密性の高い情報(例えば、身体の動き、視線、手関節、表情など)を収集する、豊富なセンサーが備わっている。これらの新しいセンサーのデータは、明示的な識別子がなくても、ユーザーをユニークに識別するために使用することができる。本稿では,VRセンサデータのみに基づいて,さまざまなジャンルの現実世界のアプリ内外において,ユーザが特定できる範囲を理解することを目的とする。単一アプリ内で利用可能なAPIを観察する敵(アプリ敵対者)から、VRデバイス上の複数のアプリにまたがるすべてまたは選択されたセンサ計測を観察する敵(デバイス敵対者)まで、さまざまな能力を持つ敵について検討する。そのために、BehaVRを紹介する。BehaVRは、VRデバイス上で実行される複数のアプリによって収集されたすべてのセンサグループからのデータを収集し、分析するフレームワークである。私たちはBehaVRを使って、20の人気のある現実世界のアプリと対話する実際のユーザーからデータを収集しています。そのデータを使って、アプリ内およびアプリ間のユーザ識別のための機械学習モデルを構築し、利用可能なセンサデータから特徴を抽出します。これらのモデルがユーザを最大100%の精度で識別できることを示し、アプリや敵の機能に応じて、最も重要な特徴やセンサグループを明らかにする。私たちの知る限りでは、BehaVRはVRにおけるユーザー識別を包括的に分析する最初の研究である。 Virtual reality (VR) platforms enable a wide range of applications but pose unique privacy risks. In particular, VR devices are equipped with a rich set of sensors that collect personal and sensitive information (e.g., body motion, eye gaze, hand joints, and facial expression). The data from these newly available sensors can be used to uniquely identify a user, even in the absence of explicit identifiers. In this paper, we seek to understand the extent to which a user can be identified based solely on VR sensor data, within and across real-world apps from diverse genres. We consider adversaries with capabilities that range from observing APIs available within a single app (app adversary) to observing all or selected sensor measurements across multiple apps on the VR device (device adversary). To that end, we introduce BehaVR, a framework for collecting and analyzing data from all sensor groups collected by multiple apps running on a VR device. We use BehaVR to collect data from real users who interact with 20 popular real-world apps. 
We use that data to build machine learning models for user identification within and across apps, with features extracted from available sensor data. We show that these models can identify users with an accuracy of up to 100%, and we reveal the most important features and sensor groups, depending on the functionality of the app and the adversary. To the best of our knowledge, BehaVR is the first to analyze user identification in VR comprehensively, i.e., considering all sensor measurements available on consumer VR devices, collected by multiple real-world, as opposed to custom-made, apps.	公開日:2024-09-23 翻訳日:2024-11-09 14:40:04
# コードLLMのための高リソースから低リソースプログラミング言語への知識伝達 Knowledge Transfer from High-Resource to Low-Resource Programming Languages for Code LLMs ( http://arxiv.org/abs/2308.09895v6 ) ライセンス: Link先を確認	Federico Cassano, John Gouwar, Francesca Lucchetti, Claire Schlesinger, Anders Freeman, Carolyn Jane Anderson, Molly Q Feldman, Michael Greenberg, Abhinav Jangda, Arjun Guha,	(参考訳) ここ数年、Large Language Models of Code (Code LLMs) はプログラミングの実践に大きな影響を与え始めています。プログラミング言語やソフトウェア工学の研究のためのビルディングブロックとして、コードLLMが登場している。しかし、Code LLMはトレーニングデータでよく表現されているプログラミング言語(例えば、Java、Python、JavaScript)では印象的な結果を出す一方、利用可能なトレーニングデータが限られた低リソース言語では苦戦する。低リソース言語にはOCaml、Racket、その他いくつかのものがある。本稿では,半合成データを用いた低リソース言語上でのコードLLMの性能向上に有効な手法を提案する。我々のアプローチであるMultiPL-Tは、高リソース言語からのトレーニングデータを、以下の方法で低リソース言語のトレーニングデータに変換する。 1) Code LLMを使用して、高リソース言語からのコメント付きコードのテストの合成を行い、欠陥のあるテストとテストカバレッジの低いコードをフィルタリングします。 2) コードLLMを使用してPythonコードをターゲットとする低リソース言語に翻訳し,テストを使用して翻訳を検証する。このアプローチを適用して,Julia,Lua,OCaml,R,Racketの検証済みトレーニング項目を数万個生成する。さらに、オープンなトレーニングデータ(The Stack)を備えたオープンモデル(StarCoderBase)を使用することで、ベンチマークの汚染除去や、ライセンスに違反することなくモデルをトレーニングし、それ以外の方法では不可能な実験を実行することが可能になります。 MultiPL-T 生成データを用いて,Julia,Lua,OCaml,R,Racket 用の StarCoderBase と Code Llama の微調整版を提示する。確立されたベンチマーク(MultiPL-E)では、これらのモデルは他のオープンコードLLMよりも優れている。 MultiPL-Tアプローチは、新しい言語に簡単に適用でき、より長いトレーニングのような代替手段よりもはるかに効率的で効果的である。 Over the past few years, Large Language Models of Code (Code LLMs) have started to have a significant impact on programming practice. Code LLMs are also emerging as building blocks for research in programming languages and software engineering. However, Code LLMs produce impressive results on programming languages that are well represented in their training data (e.g., Java, Python, or JavaScript), but struggle with low-resource languages that have limited training data available. Low-resource languages include OCaml, Racket, and several others. This paper presents an effective approach for boosting the performance of Code LLMs on low-resource languages using semi-synthetic data. 
Our approach, MultiPL-T, translates training data from high-resource languages into training data for low-resource languages in the following way. 1) We use a Code LLM to synthesize tests for commented code from a high-resource language, filtering out faulty tests and code with low test coverage. 2) We use a Code LLM to translate Python code to a target low-resource language, and use tests to validate the translation. We apply this approach to generate tens of thousands of validated training items for Julia, Lua, OCaml, R, and Racket. Furthermore, we use an open model (StarCoderBase) with open training data (The Stack), which allows us to decontaminate benchmarks, train models without violating licenses, and run experiments that could not otherwise be done. With MultiPL-T generated data, we present fine-tuned versions of StarCoderBase and Code Llama for Julia, Lua, OCaml, R, and Racket. On established benchmarks (MultiPL-E), these models outperform other open Code LLMs. The MultiPL-T approach is easy to apply to new languages, and is significantly more efficient and effective than alternatives such as training longer.	公開日:2024-09-22 翻訳日:2024-11-09 14:40:04
# ChatEDA:EDAのための大規模言語モデル駆動自律エージェント ChatEDA: A Large Language Model Powered Autonomous Agent for EDA ( http://arxiv.org/abs/2308.10204v4 ) ライセンス: Link先を確認	Zhuolun He, Haoyuan Wu, Xinyun Zhang, Xufeng Yao, Su Zheng, Haisheng Zheng, Bei Yu,	(参考訳) 相互運用性を高めるための複雑な電子設計自動化(EDA)ツールの統合は、回路設計者にとって重要な関心事である。大規模言語モデル(LLM)の最近の進歩は、自然言語処理と理解において優れた能力を示しており、EDAツールとインターフェースする新しいアプローチを提供する。本稿では,LLMであるAutoMageを活用したEDAの自律エージェントであるChatEDAを紹介する。ChatEDAは実行役としてのEDAツールで補完される。 ChatEDAは、タスク分解、スクリプト生成、タスク実行を効果的に管理することで、レジスタ転送レベル(RTL)からGraphic Data System Version II(GDSII)への設計フローを合理化する。総合的な実験評価を通じて,ChatEDAは多様な要求に対処する能力を示し,我々の微調整AutoMageモデルはGPT-4や他のLLMと比較して優れた性能を示した。 The integration of a complex set of Electronic Design Automation (EDA) tools to enhance interoperability is a critical concern for circuit designers. Recent advancements in large language models (LLMs) have showcased their exceptional capabilities in natural language processing and comprehension, offering a novel approach to interfacing with EDA tools. This research paper introduces ChatEDA, an autonomous agent for EDA empowered by an LLM, AutoMage, complemented by EDA tools serving as executors. ChatEDA streamlines the design flow from the Register-Transfer Level (RTL) to the Graphic Data System Version II (GDSII) by effectively managing task decomposition, script generation, and task execution. Through comprehensive experimental evaluations, ChatEDA has demonstrated its proficiency in handling diverse requirements, and our fine-tuned AutoMage model has exhibited superior performance compared to GPT-4 and other similar LLMs.	公開日:2024-09-21 翻訳日:2024-11-09 14:40:04
# 局所最小値を飛び越える:Vision Transformerの損失ランドスケープにおける量子化 Jumping through Local Minima: Quantization in the Loss Landscape of Vision Transformers ( http://arxiv.org/abs/2308.10814v3 ) ライセンス: Link先を確認	Natalia Frumkin, Dibakar Gope, Diana Marculescu,	(参考訳) 量子化スケールとビット幅は、ニューラルネットワークの量子化方法を考える上で最も重要なパラメータである。先行研究は、勾配法(勾配降下法とヘッセ解析)を通じて、グローバルな方法で量子化スケールを最適化することに焦点を当てている。しかし、量子化スケールに摂動を適用すると、非常にギザギザで、極めて非滑らかなテスト損失ランドスケープが観察される。実際、量子化スケールでの小さな摂動は精度に大きな影響を与え、4ビット量子化Vision Transformer(ViT)において $0.5$-$0.8\%$ の精度向上をもたらす。この体制では、勾配法は局所最小値に確実に到達できないため、破綻する。 Evol-Qと呼ばれる我々の研究では、進化的探索を用いて非滑らかなランドスケープを効果的に横断する。さらに我々は,小さなキャリブレーションデータセット($1,000$枚の画像)への過学習を抑えるのに役立つだけでなく,そのような非滑らかな表面のトラバースを容易にするinfoNCE損失を提案する。 Evol-Q は完全量子化された ViT-Base のトップ-1 の精度を,$3$ビット,$4$ビット,$8$ビットの重み量子化レベルでそれぞれ $10.30\%$,$0.78\%$,$0.15\%$ 改善している。様々なCNNおよびViTアーキテクチャに関する大規模な実験は、極端量子化シナリオにおけるその堅牢性をさらに証明している。私たちのコードはhttps://github.com/enyac-group/evol-qで利用可能です。 Quantization scale and bit-width are the most important parameters when considering how to quantize a neural network. Prior work focuses on optimizing quantization scales in a global manner through gradient methods (gradient descent and Hessian analysis). Yet, when applying perturbations to quantization scales, we observe a very jagged, highly non-smooth test loss landscape. In fact, small perturbations in quantization scale can greatly affect accuracy, yielding a $0.5$-$0.8\%$ accuracy boost in 4-bit quantized vision transformers (ViTs). In this regime, gradient methods break down, since they cannot reliably reach local minima. In our work, dubbed Evol-Q, we use evolutionary search to effectively traverse the non-smooth landscape. Additionally, we propose using an infoNCE loss, which not only helps combat overfitting on the small calibration dataset ($1,000$ images) but also makes traversing such a highly non-smooth surface easier. Evol-Q improves the top-1 accuracy of a fully quantized ViT-Base by $10.30\%$, $0.78\%$, and $0.15\%$ for $3$-bit, $4$-bit, and $8$-bit weight quantization levels, respectively. 
Extensive experiments on a variety of CNN and ViT architectures further demonstrate its robustness in extreme quantization scenarios. Our code is available at https://github.com/enyac-group/evol-q	公開日:2024-09-26 翻訳日:2024-11-09 14:40:04
# 時空間グラフ条件拡散モデルを用いた多変量時系列異常検出 Contaminated Multivariate Time-Series Anomaly Detection with Spatio-Temporal Graph Conditional Diffusion Models ( http://arxiv.org/abs/2308.12563v3 ) ライセンス: Link先を確認	Thi Kieu Khanh Ho, Narges Armanfard,	(参考訳) 主流の教師なし異常検出アルゴリズムは、しばしば学術データセットで優れているが、クリーンなトレーニングデータを含む制御された実験条件のため、実際の性能は制限されている。ノイズによるトレーニングの課題に対処するためには,現実的な異常検出の課題として,しばしば見落とされがちである。先駆的な試みとして,感覚時系列異常検出(TSAD)におけるラベルレベルのノイズの領域について検討した。本稿では,トレーニングデータを異常で汚染した場合に,新しいかつ実用的な非教師付きTSADを提案する。 TSAD-Cと呼ばれるアプローチでは、トレーニングフェーズ中に異常ラベルにアクセスできない。 TSAD-Cは、トレーニング中に発生する異常(いわゆるノイズ)を修正できるデコンタミネータ、純粋な正規データのサロゲートと見なされるデコンタミネートデータ内の長期的な内部および変数間の依存関係をキャプチャするロングレンジ可変依存性モデリングモジュール、あらゆるタイプの異常を検出するアノマリー・スコーリングモジュールの3つのコアモジュールを含んでいる。 TSAD-Cが既存の手法を超越し,TSAD分野における新たな最先端技術を確立したことを,信頼性と多種多様な4つのデータセットで実証した。 Mainstream unsupervised anomaly detection algorithms often excel in academic datasets, yet their real-world performance is restricted due to the controlled experimental conditions involving clean training data. Addressing the challenge of training with noise, a prevalent issue in practical anomaly detection, is frequently overlooked. In a pioneering endeavor, this study delves into the realm of label-level noise within sensory time-series anomaly detection (TSAD). This paper presents a novel and practical end-to-end unsupervised TSAD when the training data is contaminated with anomalies. The introduced approach, called TSAD-C, is devoid of access to abnormality labels during the training phase. TSAD-C encompasses three core modules: a Decontaminator to rectify anomalies (aka noise) present during training, a Long-range Variable Dependency Modeling module to capture long-term intra- and inter-variable dependencies within the decontaminated data that is considered as a surrogate of the pure normal data, and an Anomaly Scoring module to detect anomalies from all types. 
Our extensive experiments conducted on four reliable and diverse datasets conclusively demonstrate that TSAD-C surpasses existing methodologies, thus establishing a new state-of-the-art in the TSAD field.	公開日:2024-09-26 翻訳日:2024-11-09 14:40:04
# EECS学生のためのハンズオン量子プログラミング研究室 Hands-on Quantum Programming Labs for EECS Students ( http://arxiv.org/abs/2308.14002v5 ) ライセンス: Link先を確認	Janche Sang, Chansu Yu,	(参考訳) 本報告では,電子工学と計算機科学(EECS)の学生に,専用のプログラムラボを通じて量子コンピューティングを教える実践的なアプローチを提案する。実験室は様々なトピックをカバーしており、絡み合い、量子ゲート、回路、量子鍵分布、DeutschとDeutsch-Jozsaアルゴリズム、Simonのアルゴリズム、Groverのアルゴリズムといった先進的なアルゴリズムを含む。教育者として、現場にいる仲間のインストラクターと教えの洞察とリソースを共有することを目的としている。興味のあるインストラクターには、完全なラボハンドアウトとプログラムテンプレートが提供される。さらに、このレポートは、それぞれの実験の設計の背後にある理論的根拠を解明し、量子コンピューティングのより深い理解を可能にする。 This report presents a practical approach to teaching quantum computing to Electrical Engineering & Computer Science (EECS) students through dedicated hands-on programming labs. The labs cover a diverse range of topics, encompassing fundamental elements, such as entanglement, quantum gates and circuits, as well as advanced algorithms including Quantum Key Distribution, Deutsch and Deutsch-Jozsa Algorithms, Simon's algorithm, and Grover's algorithm. As educators, we aim to share our teaching insights and resources with fellow instructors in the field. The full lab handouts and program templates are provided for interested instructors. Furthermore, the report elucidates the rationale behind the design of each experiment, enabling a deeper understanding of quantum computing.	公開日:2024-09-23 翻訳日:2024-11-09 14:40:04
# LLM in the Shell: Generative Honeypots LLM in the Shell: Generative Honeypots ( http://arxiv.org/abs/2309.00155v3 ) ライセンス: Link先を確認	Muris Sladić, Veronica Valeros, Carlos Catania, Sebastian Garcia,	(参考訳) ハニーポットはサイバーセキュリティにおいて、早期発見、脅威情報収集、攻撃者の行動分析に不可欠なツールである。しかし、そのほとんどは、人間の攻撃者を長期にわたって巻き込み、騙すために必要なリアリズムを欠いている。ハニーポットであると容易に見分けられてしまうことは、その効果を強く妨げている。これは、決定論的すぎること、適応性の欠如、深みの欠如によって起こりうる。この研究は、Linuxライクなシェル出力を生成するLarge Language Modelsをベースとした、動的で現実的なソフトウェアハニーポットであるshelLMを導入している。我々はクラウドベースのLLMを用いてshelLMを設計・実装した。我々は,shelLMが実Linuxシェルから期待される通りの出力を生成できるかどうかを評価した。この評価は、サイバーセキュリティ研究者にハニーポットを使用してもらい、ハニーポットからのそれぞれの回答がLinuxシェルから期待されるものかどうかをフィードバックしてもらうことで行った。以上の結果から,shelLMは現在のハニーポットの限界に対処できる信頼性のある動的な回答を生成できることが示唆された。 shelLM は TNR 0.90 に達し、実際の Linux シェルと整合性があることを人間に納得させた。実験を複製するソースコードとプロンプトが公開されている。 Honeypots are essential tools in cybersecurity for early detection, threat intelligence gathering, and analysis of attackers' behavior. However, most of them lack the required realism to engage and fool human attackers long-term. Honeypots that are easy to identify are far less effective. This can happen because they are too deterministic, lack adaptability, or lack depth. This work introduces shelLM, a dynamic and realistic software honeypot based on Large Language Models that generates Linux-like shell output. We designed and implemented shelLM using cloud-based LLMs. We evaluated whether shelLM can generate output as expected from a real Linux shell. The evaluation was done by asking cybersecurity researchers to use the honeypot and give feedback on whether each answer from the honeypot was the one expected from a Linux shell. Results indicate that shelLM can create credible and dynamic answers capable of addressing the limitations of current honeypots. ShelLM reached a TNR of 0.90, convincing humans it was consistent with a real Linux shell. The source code and prompts for replicating the experiments are publicly available.	公開日:2024-09-23 翻訳日:2024-11-09 14:40:04
# 古典的到着時間のモヤル変形 Moyal deformation of the classical arrival time ( http://arxiv.org/abs/2309.00222v4 ) ライセンス: Link先を確認	Dean Alvin L. Pablico, Eric A. Galapon,	(参考訳) 到着の量子時間(TOA)問題は、粒子の初期状態のみを仮定して測定された到着時間の統計を必要とする。量子論の標準的な枠組みに従って、この問題は古典的到着時刻 $\mathcal{T}_C(q,p)$ の適切な量子像(通常は作用素形式 $\hat{\mathrm{T}}$)を見つけることに変換される。本稿では、量子力学の位相空間定式化における問題を新たに考察する。得られた量子像は実数値で時間反転対称な関数 $\mathcal{T}_M(q,p)$ であり、古典的到着時刻を主項とする $\hbar^2$ の形式的級数である。これは系のハミルトニアンとのモヤル括弧関係から直接得られ、したがって古典的TOAのモヤル変形として解釈される。その性質について検討し、$\mathcal{T}_M(q,p)$ と [Eur. Phys. J. Plus \textbf{138}, 153 (2023)] で構築されたリグド・ヒルベルト空間TOA作用素との間の同型性を示すことによって、既知の量子化の障害をどのように回避するかを議論する。この作用素は任意の解析ポテンシャルに対して常に時間-エネルギーの正準交換関係(TECCR)を満たす。次に、自由粒子と四次振動子ポテンシャルのTOA問題を例として考察する。 The quantum time of arrival (TOA) problem requires the statistics of measured arrival times given only the initial state of a particle. Following the standard framework of quantum theory, the problem translates into finding an appropriate quantum image of the classical arrival time $\mathcal{T}_C(q,p)$, usually in operator form $\hat{\mathrm{T}}$. In this paper, we consider the problem anew within the phase space formulation of quantum mechanics. The resulting quantum image is a real-valued and time-reversal symmetric function $\mathcal{T}_M(q,p)$ in formal series of $\hbar^2$ with the classical arrival time as the leading term. It is obtained directly from the Moyal bracket relation with the system Hamiltonian and is hence interpreted as a Moyal deformation of the classical TOA. We investigate its properties and discuss how it bypasses the known obstructions to quantization by showing the isomorphism between $\mathcal{T}_M(q,p)$ and the rigged Hilbert space TOA operator constructed in [Eur. Phys. J. Plus \textbf{138}, 153 (2023)], which always satisfies the time-energy canonical commutation relation (TECCR) for arbitrary analytic potentials. We then examine TOA problems for a free particle and a quartic oscillator potential as examples.	公開日:2024-09-27 翻訳日:2024-11-09 14:40:04
# Blizzard Challenge 2023におけるFruitShellフランス語合成システム The FruitShell French synthesis system at the Blizzard 2023 Challenge ( http://arxiv.org/abs/2309.00223v3 ) ライセンス: Link先を確認	Xin Qi, Xiaopeng Wang, Zhiyong Wang, Wang Liu, Mingming Ding, Shuchen Shi,	(参考訳) 本稿では,Blizzard Challenge 2023のためのフランス語音声合成システムを提案する。この課題は、女性話者から高品質な音声を生成することと、特定の個人によく似た音声を生成することの2つのタスクから構成される。コンペティションデータについては,欠落したテキストデータや誤ったテキストデータを除去するスクリーニング処理を行った。音素以外のすべての記号を整理し,発音や持続時間を持たない記号を除去した。さらに、テキストに単語境界と開始/終了記号を追加した。これは過去の経験から音声品質の向上に有効であることがわかっている。 Spokeタスクでは,コンペティションのルールに従ってデータ拡張を行った。我々は、オープンソースのG2Pモデルを使用して、フランス語のテキストを音素に書き起こした。 G2PモデルはIPA(International Phonetic Alphabet)を用いており、提供されたコンペティションデータに同じ書き起こし処理を適用して標準化した。しかし、IPAチャートから特殊記号を認識する際のコンパイラの制限により、全ての音素をコンペティションデータで使用される音素体系に変換する規則に従った。最後に,すべてのコンペティション音声を均一なサンプリングレート16kHzに再サンプリングした。 HiFi-GANボコーダを備えたVITSベースの音響モデルを用いた。 Spokeタスクでは,複数話者モデルを訓練し,モデルの持続時間予測器,ボコーダ,フロー層に話者情報を組み込んだ。システム評価の結果,品質MOSスコアはHubタスクで3.6,Spokeタスクで3.4であり,全参加チームの中で平均的な水準であった。 This paper presents a French text-to-speech synthesis system for the Blizzard Challenge 2023. The challenge consists of two tasks: generating high-quality speech from female speakers and generating speech that closely resembles specific individuals. Regarding the competition data, we conducted a screening process to remove missing or erroneous text data. We organized all symbols except for phonemes and eliminated symbols that had no pronunciation or zero duration. Additionally, we added word boundary and start/end symbols to the text, which we have found to improve speech quality based on our previous experience. For the Spoke task, we performed data augmentation according to the competition rules. We used an open-source G2P model to transcribe the French texts into phonemes. As the G2P model uses the International Phonetic Alphabet (IPA), we applied the same transcription process to the provided competition data for standardization. 
However, due to compiler limitations in recognizing special symbols from the IPA chart, we followed the rules to convert all phonemes into the phonetic scheme used in the competition data. Finally, we resampled all competition audio to a uniform sampling rate of 16 kHz. We employed a VITS-based acoustic model with the HiFi-GAN vocoder. For the Spoke task, we trained a multi-speaker model and incorporated speaker information into the duration predictor, vocoder, and flow layers of the model. The evaluation results of our system showed a quality MOS score of 3.6 for the Hub task and 3.4 for the Spoke task, placing our system at an average level among all participating teams.	公開日:2024-09-25 翻訳日:2024-11-09 14:40:04
# 置換不変エンコーダとよりタイトな変分目的を用いたマルチモーダル生成モデルの学習 Learning multi-modal generative models with permutation-invariant encoders and tighter variational objectives ( http://arxiv.org/abs/2309.00380v3 ) ライセンス: Link先を確認	Marcel Hirt, Domenico Campolo, Victoria Leong, Juan-Pablo Ortega,	(参考訳) マルチモーダルデータに対する深層潜在変数モデルの開発は、機械学習研究において長年のテーマであった。マルチモーダル変分オートエンコーダ(VAE)は、複数のモダリティを共同で説明する潜在表現を学習する一般的な生成モデルクラスである。このようなモデルに対する様々な目的関数が提案されており、しばしばマルチモーダルデータの対数尤度の下界として、あるいは情報理論的な考察から動機付けられる。異なるモダリティ部分集合から潜在変数を符号化するために、Product-of-Experts(PoE)またはMixture-of-Experts(MoE)アグリゲーションスキームが日常的に使われ、例えば、生成品質や複数のモダリティにわたる一貫性に関して、異なるトレードオフをもたらすことが示されている。本研究では,データの対数尤度をタイトに近似できる変分目的について考察する。我々は、置換不変ニューラルネットワークに基づいて異なるモダリティから符号化された特徴を組み合わせることにより、PoEやMoEアプローチの帰納バイアスを回避する、より柔軟なアグリゲーションスキームを開発する。数値実験では、マルチモーダル変分目的と様々なアグリゲーションスキームのトレードオフについて述べる。同定可能なモデルにおいて、観測されたモダリティと潜在変数の真の同時分布を近似したい場合、我々の変分目的およびより柔軟なアグリゲーションモデルが有益であることが示される。 Devising deep latent variable models for multi-modal data has been a long-standing theme in machine learning research. Multi-modal Variational Autoencoders (VAEs) have been a popular generative model class that learns latent representations that jointly explain multiple modalities. Various objective functions for such models have been suggested, often motivated as lower bounds on the multi-modal data log-likelihood or from information-theoretic considerations. To encode latent variables from different modality subsets, Product-of-Experts (PoE) or Mixture-of-Experts (MoE) aggregation schemes have been routinely used and shown to yield different trade-offs, for instance, regarding their generative quality or consistency across multiple modalities. In this work, we consider a variational objective that can tightly approximate the data log-likelihood. We develop more flexible aggregation schemes that avoid the inductive biases in PoE or MoE approaches by combining encoded features from different modalities based on permutation-invariant neural networks. 
Our numerical experiments illustrate trade-offs for multi-modal variational objectives and various aggregation schemes. We show that our variational objective and more flexible aggregation models can become beneficial when one wants to approximate the true joint distribution over observed modalities and latent variables in identifiable models.	公開日:2024-09-24 翻訳日:2024-11-09 14:40:04
# ロバストオンライン分類:推定からデノイジングへ Robust Online Classification: From Estimation to Denoising ( http://arxiv.org/abs/2309.01698v2 ) ライセンス: Link先を確認	Changlong Wu, Ananth Grama, Wojciech Szpankowski,	(参考訳) 一般の仮説クラスを用いた、特徴からラベルへのオンライン分類を研究する。我々の設定では、真のラベルは仮説クラス内の何らかの関数によって決定されるが、未知の確率ノイズによって破損し、特徴は敵対的に生成される。観測されたノイズラベルとノイズレス特徴を用いて予測を行い、真のラベルと比較した場合のミニマックスリスクを用いて性能を計測する。ノイズ機構は、個々のデータポイントに対して、実際のノイズラベル分布が選択される分布の集合を指定する一般的なノイズカーネルを介してモデル化される。ミニマックスリスクは,カーネルが誘導するノイズラベル分布のHellingerギャップによって(仮説クラスサイズの対数係数まで)タイトに特徴付けられ,ノイズの平均や分散といった他の特性に依存しないことを示す。本手法は,オンライン設定に適したLe Cam-Birgéテストの新しい条件付きバージョンとともに,2つの仮説のオンライン比較スキームへの新規な帰着に基づく。本研究は,一般の雑音観測に対処しながら,基礎的真理に関する保証を備えた,ノイズの多いオンライン分類の包括的特徴付けを初めて提供する。 We study online classification of features into labels with general hypothesis classes. In our setting, true labels are determined by some function within the hypothesis class but are corrupted by unknown stochastic noise, and the features are generated adversarially. Predictions are made using observed noisy labels and noiseless features, while the performance is measured via minimax risk when comparing against true labels. The noise mechanism is modeled via a general noise kernel that specifies, for any individual data point, a set of distributions from which the actual noisy label distribution is chosen. We show that minimax risk is tightly characterized (up to a logarithmic factor of the hypothesis class size) by the Hellinger gap of the noisy label distributions induced by the kernel, independent of other properties such as the means and variances of the noise. Our main technique is based on a novel reduction to an online comparison scheme of two hypotheses, along with a new conditional version of Le Cam-Birgé testing suitable for online settings. Our work provides the first comprehensive characterization for noisy online classification with guarantees with respect to the ground truth while addressing general noisy observations.	公開日:2024-09-25 翻訳日:2024-11-09 14:40:04
# 逐次ボリューム設計課題のための表現学習 Representation Learning for Sequential Volumetric Design Tasks ( http://arxiv.org/abs/2309.02583v2 ) ライセンス: Link先を確認	Md Ferdous Alam, Yi Wang, Chin-Yi Cheng, Jieliang Luo,	(参考訳) ボリュームデザイン(英: volumetric design)は、マッシングデザインとも呼ばれる、プロの建築設計における最初の重要なステップであり、本質的にはシーケンシャルである。ボリューム設計プロセスは慎重な設計決定と反復的な調整を必要とするため、基礎となるシーケンシャル設計プロセスはデザイナーにとって貴重な情報をエンコードする。合理的なボリューム設計を自動生成するための多くの努力がなされているが、生成した設計ソリューションの品質は様々であり、設計ソリューションを評価するには、極めて包括的なメトリクスセットか、高価な人間の専門知識が必要である。従来のアプローチは逐次的な設計タスクではなく最終設計のみの学習に焦点をあてていたが,本研究では設計知識を専門家や高性能な設計シーケンスの集合から符号化し,トランスフォーマーモデルを用いて有用な表現を抽出することを提案する。さらに、設計選好評価や手続き設計生成といった重要な下流アプリケーションにおいて、学習した表現を活用することを提案する。本研究では,学習した表現の密度を推定して嗜好モデルを開発する一方で,逐次設計生成のための自己回帰変換モデルを訓練する。数千のシーケンシャルなボリュームデザインの新たなデータセットを活用することで、私たちのアイデアを実証する。我々の選好モデルは、任意に与えられた2つの設計シーケンスを比較することができ、ランダムな設計シーケンスに対する評価において約$90\%$の精度を持つ。我々の自己回帰モデルは、部分設計シーケンスからボリューム設計シーケンスを自動補完することも可能である。 Volumetric design, also called massing design, is the first and critical step in professional building design which is sequential in nature. As the volumetric design process requires careful design decisions and iterative adjustments, the underlying sequential design process encodes valuable information for designers. Many efforts have been made to automatically generate reasonable volumetric designs, but the quality of the generated design solutions varies, and evaluating a design solution requires either a prohibitively comprehensive set of metrics or expensive human expertise. While previous approaches focused on learning only the final design instead of sequential design tasks, we propose to encode the design knowledge from a collection of expert or high-performing design sequences and extract useful representations using transformer-based models. Later we propose to utilize the learned representations for crucial downstream applications such as design preference evaluation and procedural design generation. 
We develop the preference model by estimating the density of the learned representations whereas we train an autoregressive transformer model for sequential design generation. We demonstrate our ideas by leveraging a novel dataset of thousands of sequential volumetric designs. Our preference model can compare two arbitrarily given design sequences and is almost $90\%$ accurate in evaluation against random design sequences. Our autoregressive model is also capable of autocompleting a volumetric design sequence from a partial design sequence.	公開日:2024-09-24 翻訳日:2024-11-09 14:40:04
# 窒素空孔電子スピン欠陥の制御可能性限界の定量化 Quantifying the limits of controllability for the nitrogen-vacancy electron spin defect ( http://arxiv.org/abs/2309.03120v2 ) ライセンス: Link先を確認	Paul Kairys, Jonathan C. Marcks, Nazar Delegan, Jiefei Zhang, David D. Awschalom, F. Joseph Heremans,	(参考訳) ダイヤモンドの窒素空孔中心のような固体電子スピン量子ビットは、感度を高めデバイスコヒーレンスを改善するために、集団反転の制御配列に依存している。しかし、このパラダイムシステムでさえ、集団反転の基本的な限界と量子センシングのような応用に対する潜在的な影響は定量的に評価されていない。ここでは、隣り合う核スピンの明示的なユニタリシミュレーションを含む、回転波近似を超えた高精度なシミュレーションを行う。量子最適制御を用いて、スピン-1基底状態内の量子ビット部分空間の制御のための解析パルスを同定し、パルス複雑性、制御時間、忠実度の関係を定量化する。制御期間を短縮した振幅と帯域幅の要求を指数関数的に増加させ,さらにサブナノ秒集団インバージョンを用いたマルチパルス列に対する非マルコフ効果の出現を定量化する。このことから、還元された忠実度と非マルコフ性は、電子スピンと核スピン環境とのコヒーレントな相互作用に起因すると判定する。最終的には、高忠実度多重パルス列に対するナノ秒制御の潜在的実現可能な機構を同定する。これらの結果は、ダイヤモンドの電子スピン欠陥を用いた量子情報処理の基本的な限界に関する重要な洞察を与える。 Solid-state electron spin qubits, like the nitrogen-vacancy center in diamond, rely on control sequences of population inversion to enhance sensitivity and improve device coherence. But even for this paradigmatic system, the fundamental limits of population inversion and potential impacts on applications like quantum sensing have not been assessed quantitatively. Here, we perform high accuracy simulations beyond the rotating wave approximation, including explicit unitary simulation of neighboring nuclear spins. Using quantum optimal control, we identify analytical pulses for the control of a qubit subspace within the spin-1 ground state and quantify the relationship between pulse complexity, control duration, and fidelity. We find exponentially increasing amplitude and bandwidth requirements with reduced control duration and further quantify the emergence of non-Markovian effects for multipulse sequences using sub-nanosecond population inversion. From this, we determine that the reduced fidelity and non-Markovianity is due to coherent interactions of the electron spin with the nuclear spin environment. 
Ultimately, we identify a potentially realizable regime of nanosecond control duration for high-fidelity multipulse sequences. These results provide key insights into the fundamental limits of quantum information processing using electron spin defects in diamond.	公開日:2024-09-24 翻訳日:2024-11-09 14:40:04
# グラウンドトゥルースの生成:ソフトラベルとラベルノイズ研究のための合成データ Generating the Ground Truth: Synthetic Data for Soft Label and Label Noise Research ( http://arxiv.org/abs/2309.04318v2 ) ライセンス: Link先を確認	Sjoerd de Vries, Dirk Thierens,	(参考訳) 多くの実世界の分類タスクにおいて、ラベルノイズは機械学習モデルの一般化誤差に悪影響を及ぼす避けられない問題である。また, クリーンなラベルを使わずに, ラベルノイズが性能に与える影響を正確に定量化できないため, このようなノイズの処理方法の評価は困難である。ラベルノイズに関する既存の研究は、通常、ノイズを含むデータまたは過度に単純化されたシミュレーションデータをベースラインとして依存し、既知の特性を持つ追加ノイズを注入する。本稿では,実世界のデータに基づくノイズのないデータセットを作成することで,これらの制約に対処するフレームワークであるSYNLABELを紹介する。 SYNLABELは、事前指定または学習された関数を基底真理関数として定義することをサポートし、新しいクリーンラベルの生成に使用できる。さらに、関数の領域内で選択された特徴の値を繰り返し再サンプリングし、関数を評価し、その結果のラベルを集約することにより、各データポイントにソフトラベルまたはラベル分布を割り当てることができる。これらの分布は多くの実世界のデータセットに存在する固有の不確実性を捉え、ラベルノイズの直接注入と定量化を可能にする。生成されたデータセットは、さまざまな種類のノイズを導入可能な、調整可能な複雑性のクリーンなベースラインとして機能する。さらに、ソフトラベル学習と関連する応用の研究を促進する。我々はSYNLABELの応用を実演し、ラベルノイズを正確に定量化し、既存の手法よりも改善したことを示す。 In many real-world classification tasks, label noise is an unavoidable issue that adversely affects the generalization error of machine learning models. Additionally, evaluating how methods handle such noise is complicated, as the effect label noise has on their performance cannot be accurately quantified without clean labels. Existing research on label noise typically relies on either noisy or oversimplified simulated data as a baseline, into which additional noise with known properties is injected. In this paper, we introduce SYNLABEL, a framework designed to address these limitations by creating noiseless datasets informed by real-world data. SYNLABEL supports defining a pre-specified or learned function as the ground truth function, which can then be used for generating new clean labels. Furthermore, by repeatedly resampling values for selected features within the domain of the function, evaluating the function and aggregating the resulting labels, each data point can be assigned a soft label or label distribution. 
These distributions capture the inherent uncertainty present in many real-world datasets and enable the direct injection and quantification of label noise. The generated datasets serve as a clean baseline of adjustable complexity, into which various types of noise can be introduced. Additionally, they facilitate research into soft label learning and related applications. We demonstrate the application of SYNLABEL, showcasing its ability to precisely quantify label noise and its improvement over existing methodologies.	公開日:2024-09-23 翻訳日:2024-11-09 14:40:04
# 量子コンピュータのためのリアルタイム・スケーラブル・高速・高資源効率なデコーダ A real-time, scalable, fast and highly resource efficient decoder for a quantum computer ( http://arxiv.org/abs/2309.05558v2 ) ライセンス: Link先を確認	Ben Barber, Kenton M. Barnes, Tomasz Bialas, Okan Buğdaycı, Earl T. Campbell, Neil I. Gillespie, Kauser Johar, Ram Rajan, Adam W. Richardson, Luka Skoric, Canberk Topal, Mark L. Turner, Abbas B. Ziad,	(参考訳) 量子コンピュータの可能性を解き放つためには、量子ビットの性能に対するノイズ効果を慎重に管理する必要がある。ノイズによって引き起こされる計算エラーを診断するデコーダは、大きな量子ビット数へのスケーリングと低温動作を可能にするために、リソースを効率的に利用しなければならない。さらに、量子コンピュータの論理クロックレートが指数関数的に遅くなるのを避けるために、高速に動作する必要がある。このような課題を克服するために、Collision Clusteringデコーダを導入し、FPGAおよびASICハードウェア上で実装する。主要な量子誤り訂正方式である表面符号を用いて論理メモリ実験をシミュレーションし, 超伝導量子ビットなどの高速動作モダリティの要求に合致するMHzの復号速度を, FPGAとASICでそれぞれ最大881および1057量子ビットの表面符号まで実証した。 ASIC の設計は 0.06 mm$^2$ であり、わずか 8 mW の電力しか消費しない。我々のデコーダは高い性能とリソース効率を持ち、フォールトトレラントな量子コンピュータを実現するための実行可能な道を開く。 To unleash the potential of quantum computers, noise effects on qubits' performance must be carefully managed. The decoders responsible for diagnosing noise-induced computational errors must use resources efficiently to enable scaling to large qubit counts and cryogenic operation. Additionally, they must operate at speed, to avoid an exponential slowdown in the logical clock rate of the quantum computer. To overcome such challenges, we introduce the Collision Clustering decoder and implement it on FPGA and ASIC hardware. We simulate logical memory experiments using the leading quantum error correction scheme, the surface code, and demonstrate MHz decoding speed - matching the requirements of fast-operating modalities such as superconducting qubits - up to an 881 and 1057 qubits surface code with the FPGA and ASIC, respectively. The ASIC design occupies 0.06 mm$^2$ and consumes only 8 mW of power. Our decoder is both highly performant and resource efficient, unlocking a viable path to practically realising fault-tolerant quantum computers.	公開日:2024-09-24 翻訳日:2024-11-09 14:28:50
# 短絡-断熱による高忠実度マクロ微視的重ね合わせ状態 High fidelity macroscopic superposition states via shortcut to adiabaticity ( http://arxiv.org/abs/2309.06031v2 ) ライセンス: Link先を確認	Mehdi Aslani, Vahid Salari, Mehdi Abdi,	(参考訳) 巨視的空間重畳状態の大規模物体を調製するために, 断熱方式のショートカットを提案する。本稿では, トラップ電位をパラボラから二重井戸に調整しながら, 即時ハミルトニアンの基底状態におけるシステム維持に反断熱駆動を用いることを提案する。これは、制御パラメータを適切に傾斜させて行われる。いくつかの反断熱ドライブは、ほとんどのケースで十分であることを示す。この実装のために超伝導回路のハイブリッド電気機械構成を提案する。本手法の効率は,ノイズや不完全性の存在下でのシステムの力学を数値的に解くことで評価される。その結果,高忠実度で空間的に識別可能な猫状態を持つ機械共振器をプロトコルを用いて作成できることが示唆された。さらに、このプロトコルはノイズや不完全性に対して堅牢である。また、結合回路電気力学キャビティモードの分光による最終状態の検証手法についても検討する。我々の研究は、将来の実験において、マクロな重ね合わせ状態を実現し、検証するための基礎研究として役立てることができる。 A shortcut to an adiabatic scheme is proposed for preparing a massive object in a macroscopic spatial superposition state. In this scheme we propose to employ counterdiabatic driving to maintain the system in the ground state of its instantaneous Hamiltonian while the trap potential is tuned from a parabola to a double well. This, in turn, is performed by properly ramping a control parameter. We show that a few counterdiabatic drives are enough for most practical cases. A hybrid electromechanical setup in superconducting circuits is proposed for the implementation. The efficiency of our scheme is benchmarked by numerically solving the system dynamics in the presence of noises and imperfections. The results show that a mechanical resonator with very-high-fidelity spatially distinguishable cat states can be prepared with our protocol. Furthermore, the protocol is robust against noises and imperfections. We also discuss a method for verifying the final state via spectroscopy of a coupled circuit electrodynamical cavity mode. Our work can serve as the ground work to feasibly realize and verify macroscopic superposition states in future experiments.	公開日:2024-09-26 翻訳日:2024-11-09 14:28:50
# 再読は大規模言語モデルの推論を改善する Re-Reading Improves Reasoning in Large Language Models ( http://arxiv.org/abs/2309.06275v3 ) ライセンス: Link先を確認	Xiaohan Xu, Chongyang Tao, Tao Shen, Can Xu, Hongbo Xu, Guodong Long, Jian-guang Lou, Shuai Ma,	(参考訳) 既成のLarge Language Models (LLMs) の推論能力を高めるために, 単純で汎用的で効果的なプロンプト手法であるRe2を導入する。出力の推論プロセスを引き出すことを目的としたChain-of-Thought (CoT) のような、ほとんどの思考誘発型プロンプト手法とは異なり、Re2 は質問を2回処理することで、入力に焦点を移し、理解プロセスを強化する。その結果、Re2 は CoT を含むほとんどの思考誘発型プロンプト手法に対する高い汎用性と互換性を示す。重要なことに、Re2は、第1パスが第2パスのグローバル情報を提供するため、一方向デコーダのみのLLMで"双方向"エンコーディングを容易にする。まず、Re2の基礎となる予備的な実証研究から始め、その「双方向」注意機構の実現の可能性を示す。その後、14のデータセットにわたる広範囲な推論ベンチマークでRe2を評価し、112の実験にまたがって、その有効性と汎用性を検証する。以上の結果から,バニラChatGPTではいくつかのシナリオを除いて,Re2は単純な再読戦略によってLLMの推論性能を一貫して向上させることがわかった。さらなる分析により、Re2の適応性を明らかにし、異なるLLM、思考誘発型プロンプト、アンサンブル戦略と効果的に統合できることを示す。私たちのコードは \url{https://github.com/Tebmer/Rereading-LLM-Reasoning/} で利用可能です。 To enhance the reasoning capabilities of off-the-shelf Large Language Models (LLMs), we introduce a simple, yet general and effective prompting method, Re2, i.e., \textbf{Re}-\textbf{Re}ading the question as input. Unlike most thought-eliciting prompting methods, such as Chain-of-Thought (CoT), which aim to elicit the reasoning process in the output, Re2 shifts the focus to the input by processing questions twice, thereby enhancing the understanding process. Consequently, Re2 demonstrates strong generality and compatibility with most thought-eliciting prompting methods, including CoT. Crucially, Re2 facilitates a "bidirectional" encoding in unidirectional decoder-only LLMs because the first pass could provide global information for the second pass. We begin with a preliminary empirical study as the foundation of Re2, illustrating its potential to enable "bidirectional" attention mechanisms. We then evaluate Re2 on extensive reasoning benchmarks across 14 datasets, spanning 112 experiments, to validate its effectiveness and generality. 
Our findings indicate that, with the exception of a few scenarios on vanilla ChatGPT, Re2 consistently enhances the reasoning performance of LLMs through a simple re-reading strategy. Further analyses reveal Re2's adaptability, showing how it can be effectively integrated with different LLMs, thought-eliciting prompting, and ensemble strategies. Our code is available at \url{https://github.com/Tebmer/Rereading-LLM-Reasoning/}	公開日:2024-09-21 翻訳日:2024-11-09 14:28:50
# (ほぼ)量子ベルの不等式とデバイス非依存の応用 (Almost-)Quantum Bell Inequalities and Device-Independent Applications ( http://arxiv.org/abs/2309.06304v4 ) ライセンス: Link先を確認	Yuan Liu, Ho Yiu Chung, Ravishankar Ramanathan,	(参考訳) 近年、量子ベルの不等式の導出による量子相関集合の境界に関する調査が注目されているが、これはツィレルソンの問題と関連しており、DI情報処理に重要な応用がある。しかし、量子ベルの不等式を決定することは、非常に難しい課題であり、孤立した例のみが知られている。本稿では、(ほぼ)量子ベルの不等式の族を提示し、3つの基礎的およびDI的応用に焦点を当てる。第一に、ノンシグナリング境界上の量子相関は弱い源からのDIランダム性抽出において重要である。 2つのkアウトカム測定を持つ2人のプレイヤーの実用的なベルシナリオでは、量子ベルの不等式を導出し、最大4k-8次元のノンシグナリング境界の特定の部分から量子境界が分離されることを示し、前の結果を拡張する。その直接の副産物として、量子系およびほぼ量子相関に対するオーマンの合意定理の一般的な証明を与える。これは、オーマンの合意定理が、一般的なノンシグナリング理論から量子理論とほぼ量子相関の両方を選び出すための、認識論の文脈における合理的な物理原理であることを意味する。第二に、m個の二値測定を持つ2人のプレイヤーのシナリオにおける量子ベルの不等式の族を提示し、これらは2量子ビットのシングレットと2m個の測定の自己検証に役立つ。興味深いことに、この主張はTsirelson-Landau-Masanesによって発見された m=2 の結果を一般化し、最先端の DIRA よりも改善されたことを示す。最後に、量子ベルの不等式を用いて、量子相関集合を特徴づける情報理論の原理である、非局所計算において優位性がないという原理の一般形を導出する。これにより、これまでに知られている量子境界の最も正確な特徴づけを与える。 Investigations of the boundary of the quantum correlation set through the derivation of quantum Bell inequalities have gained increased attention in recent years, which are related to Tsirelson's problem and have significant applications in DI information processing. However, determining quantum Bell inequalities is a notoriously difficult task and only isolated examples are known. In this paper, we present families of (almost-)quantum Bell inequalities and highlight three foundational and DI applications. Firstly, quantum correlations on the non-signaling boundary are crucial in the DI randomness extraction from weak sources. In the practical Bell scenario of two players with two k-outcome measurements, we derive quantum Bell inequalities that show a separation of the quantum boundary from certain portions of the no-signaling boundary of dimension up to 4k-8, extending previous results. 
As an immediate by-product of this, we give a general proof of Aumann's Agreement theorem for quantum systems and the almost-quantum correlations, which implies Aumann's agreement theorem is a reasonable physical principle in the context of epistemics to pick out both quantum theory and almost-quantum correlations from general no-signaling theories. Secondly, we present a family of quantum Bell inequalities in the two players with m binary measurements scenarios, that serve to self-test the two-qubit singlet and 2m measurements. Interestingly, this claim generalizes the result for m=2 discovered by Tsirelson-Landau-Masanes and shows an improvement over the state-of-the-art DIRA. Lastly, we use our quantum Bell inequalities to derive the general form of the principle of no advantage in nonlocal computation, which is an information-theoretic principle that serves to characterize the quantum correlation set. With this, we provide the most precise characterization of the quantum boundary known so far.	公開日:2024-09-27 翻訳日:2024-11-09 14:28:50
# $\texttt{NePhi}$: 近似的に微分同相な医用画像登録のためのニューラル変形場 $\texttt{NePhi}$: Neural Deformation Fields for Approximately Diffeomorphic Medical Image Registration ( http://arxiv.org/abs/2309.07322v3 ) ライセンス: Link先を確認	Lin Tian, Hastings Greer, Raúl San José Estépar, Roni Sengupta, Marc Niethammer,	(参考訳) この研究は、近似的に微分同相な変換をもたらす一般化可能なニューラル変形モデルNePhiを提案する。学習ベースの登録アプローチで主流のボクセルベースの変換場とは対照的に、NePhiは変形を関数的に表現し、訓練・推論時のメモリ消費、推論時間、登録精度、変換の規則性からなる設計空間において大きな柔軟性をもたらす。具体的には、NePhi は 1) ボクセルベースの学習手法に比べてメモリ消費が少ない。 2) 既存のニューラル変形に基づく登録手法が最適化のみに依存しているのに対して,潜在コードの予測により推論速度が向上する。 3)インスタンス最適化により精度が向上する。 4) 医用画像登録に非常に望ましい優れた変形規則性を示す。実際の3次元医用画像データセット(肺や脳など)と同様に,2次元合成データセット上でのNePhiの性能を実証する。以上の結果から,NePhiは単一解像度の登録設定において,ボクセルに基づく表現の精度に適合できることがわかった。マルチレゾリューション登録では、インスタンス最適化を用いた現在のSOTA学習ベース登録手法と同等の精度を達成しつつ、メモリ要求を5分の1に削減する。私たちのコードは https://github.com/uncbiag/NePhi で公開されています。 This work proposes NePhi, a generalizable neural deformation model which results in approximately diffeomorphic transformations. In contrast to the predominant voxel-based transformation fields used in learning-based registration approaches, NePhi represents deformations functionally, leading to great flexibility within the design space of memory consumption during training and inference, inference time, registration accuracy, as well as transformation regularity. Specifically, NePhi 1) requires less memory compared to voxel-based learning approaches, 2) improves inference speed by predicting latent codes, compared to current existing neural deformation based registration approaches that \emph{only} rely on optimization, 3) improves accuracy via instance optimization, and 4) shows excellent deformation regularity which is highly desirable for medical image registration. We demonstrate the performance of NePhi on a 2D synthetic dataset as well as for real 3D medical image datasets (e.g., lungs and brains). Our results show that NePhi can match the accuracy of voxel-based representations in a single-resolution registration setting. 
For multi-resolution registration, our method matches the accuracy of current SOTA learning-based registration approaches with instance optimization while reducing memory requirements by a factor of five. Our code is available at https://github.com/uncbiag/NePhi.	公開日:2024-09-27 翻訳日:2024-11-09 14:28:50
# C-Pack:汎用中国語埋め込みのためのパッケージ化リソース C-Pack: Packed Resources For General Chinese Embeddings ( http://arxiv.org/abs/2309.07597v5 ) ライセンス: Link先を確認	Shitao Xiao, Zheng Liu, Peitian Zhang, Niklas Muennighoff, Defu Lian, Jian-Yun Nie,	(参考訳) C-Packは、汎用的な中国語埋め込みの分野を著しく前進させるリソースのパッケージである。 C-Packには3つの重要なリソースが含まれている。 1) C-MTEBは6つのタスクと35のデータセットをカバーする中国語テキスト埋め込みの総合ベンチマークである。 2) C-MTPは, ラベル付き, ラベルなしの中国語コーパスから収集した, 埋め込みモデルを訓練するための大規模なテキスト埋め込みデータセットである。 3) C-TEMは、複数のサイズをカバーする埋め込みモデルのファミリーである。弊社のモデルは、リリース時点で、C-MTEB上の従来のすべての中国語テキスト埋め込みを最大で10%上回っている。また、C-TEMのための一連のトレーニング方法を統合し、最適化します。汎用的な中国語埋め込みに関するリソースに加えて、英語のテキスト埋め込みのためのデータとモデルもリリースしています。 MTEBベンチマークでは、英語モデルは最先端のパフォーマンスを達成しており、我々のリリースした英語データは中国語データの2倍の規模です。これらのリソースはすべて https://github.com/FlagOpen/FlagEmbedding で公開されています。 We introduce C-Pack, a package of resources that significantly advance the field of general Chinese embeddings. C-Pack includes three critical resources. 1) C-MTEB is a comprehensive benchmark for Chinese text embeddings covering 6 tasks and 35 datasets. 2) C-MTP is a massive text embedding dataset curated from labeled and unlabeled Chinese corpora for training embedding models. 3) C-TEM is a family of embedding models covering multiple sizes. Our models outperform all prior Chinese text embeddings on C-MTEB by up to +10% upon the time of the release. We also integrate and optimize the entire suite of training methods for C-TEM. Along with our resources on general Chinese embedding, we release our data and models for English text embeddings. The English models achieve state-of-the-art performance on MTEB benchmark; meanwhile, our released English data is 2 times larger than the Chinese data. All these resources are made publicly available at https://github.com/FlagOpen/FlagEmbedding.	公開日:2024-09-24 翻訳日:2024-11-09 14:28:50
# Spectrum-Aware Debiasing - 主成分回帰への応用を伴う現代的な推論フレームワーク Spectrum-Aware Debiasing: A Modern Inference Framework with Applications to Principal Components Regression ( http://arxiv.org/abs/2309.07810v4 ) ライセンス: Link先を確認	Yufan Li, Pragya Sur,	(参考訳) デバイアスは高次元統計学における基本的な概念である。自由度調整は、高次元線形回帰における最先端技術である一方、これはi.i.d.サンプルと亜ガウス共変量に限られる。これらの制約は、その広範な実用性を妨げている。本稿では,高次元回帰のための新しい手法であるSpectrum-Aware Debiasingを紹介する。我々のアプローチは、構造化された依存関係、重いテール、低ランク構造に関する問題に適用されます。提案手法は, サンプル共分散行列のスペクトル情報を用いて再スケーリング係数を導出し, 再スケールされた勾配降下ステップによるデバイアスを実現する。スペクトルベースのアプローチは、より広い文脈での正確なデバイアスを可能にする。特徴量とサンプル数が比例的にスケールする、一般的な現代的設定を考察する。我々は、共変量が右回転不変であるとき、様々な収束概念の下で、提案した推定器の漸近正規性(適切に中心化およびスケール化)を確立する。このような設計は、圧縮センシングにおいて重要な役割を担っているため、近年注目を集めている。さらに、その漸近分散に対する一致推定量を考案する。本研究には2つの注目すべき副産物がある。第1に、主成分回帰(PCR)のバイアスを補正するためにSpectrum-Aware Debiasingを使用し、高次元における最初のデバイアスされたPCR推定器を提供する。第2に、サンプル共分散行列の固有ベクトルと信号との整合性を確認するための原理的テストを導入する。このテストは、近似メッセージパッシング、Leave-one-out、凸ガウスmin-max定理を用いて開発された統計手法にとって、それ自体で有用である。シミュレーションおよび実データ実験により本手法を実証する。技術的には、近似メッセージパッシングアルゴリズムとデバイアスを結びつけ、ベクトル近似メッセージパッシング(V-AMP)のコーシー性の最初の証明を提供する。 Debiasing is a fundamental concept in high-dimensional statistics. While degrees-of-freedom adjustment is the state-of-the-art technique in high-dimensional linear regression, it is limited to i.i.d. samples and sub-Gaussian covariates. These constraints hinder its broader practical use. Here, we introduce Spectrum-Aware Debiasing--a novel method for high-dimensional regression. Our approach applies to problems with structured dependencies, heavy tails, and low-rank structures. Our method achieves debiasing through a rescaled gradient descent step, deriving the rescaling factor using spectral information of the sample covariance matrix. The spectrum-based approach enables accurate debiasing in much broader contexts. We study the common modern regime where the number of features and samples scale proportionally. 
We establish asymptotic normality of our proposed estimator (suitably centered and scaled) under various convergence notions when the covariates are right-rotationally invariant. Such designs have garnered recent attention due to their crucial role in compressed sensing. Furthermore, we devise a consistent estimator for its asymptotic variance. Our work has two notable by-products: first, we use Spectrum-Aware Debiasing to correct bias in principal components regression (PCR), providing the first debiased PCR estimator in high dimensions. Second, we introduce a principled test for checking alignment between the signal and the eigenvectors of the sample covariance matrix. This test is independently valuable for statistical methods developed using approximate message passing, leave-one-out, or convex Gaussian min-max theorems. We demonstrate our method through simulated and real data experiments. Technically, we connect approximate message passing algorithms with debiasing and provide the first proof of the Cauchy property of vector approximate message passing (V-AMP).	公開日:2024-10-04 翻訳日:2024-11-09 14:28:50
# 量子干渉による重力相互作用ダークマターの検出 Detecting Gravitationally Interacting Dark Matter with Quantum Interference ( http://arxiv.org/abs/2309.08238v3 ) ライセンス: Link先を確認	Alejandro Perez, Carlo Rovelli, Marios Christodoulou,	(参考訳) ダークマターの存在を示す大きな天文学的な証拠にもかかわらず、ダークマターの性質は謎のままである。重力のみ、あるいはほぼ重力のみで相互作用する粒子、特に量子重力の基本的なスケールであるプランク質量付近の質量を持つ粒子は、興味深い候補である。ここでは、高感度な重力媒介の量子位相シフトを用いて、そのような粒子を直接検出する理論的可能性があることを示す。特に、ジョセフソン接合を利用したプロトコルを考える。 In spite of the large astronomical evidence for its existence, the nature of dark matter remains enigmatic. Particles that interact only, or almost only, gravitationally, in particular with masses around the Planck mass -- the fundamental scale in quantum gravity, are intriguing candidates. Here we show that there is a theoretical possibility to directly detect such particles using highly sensitive gravity-mediated quantum phase shifts. In particular, we consider a protocol utilizing Josephson junctions.	公開日:2024-09-27 翻訳日:2024-11-09 14:28:50
# フルオロベンゼン中の電子ウェーブレットのイオン化と励起によるアトケミカル量子干渉のシグナル Signature of attochemical quantum interference upon ionization and excitation of an electronic wavepacket in fluoro-benzene ( http://arxiv.org/abs/2309.08269v3 ) ライセンス: Link先を確認	Anthony Ferté, Dane Austin, Allan S. Johnson, Felicity McGrath, João Pedro Malhado, Jon P. Marangos, Morgane Vacher,	(参考訳) ウルトラショートパルスは分子を励起またはイオン化し、コヒーレントな電子ウェーブパケットを凝集させ、複雑なダイナミクスを引き起こす。本研究では, ベンゼンとフッ化ベンゼン分子の異なる電子波束へのイオン化に伴う結合電子核動力学を, 量子力学的および全次元でシミュレートする。フルオロベンゼンでは、計算は状態間および状態内量子干渉の両方を解き、アトケミカルの明確なシグネチャと自己相関関数の形状における電荷方向のダイナミクスを残せる。後者はベンゼンとフルオロベンゼンの実験的な高調波分光測定と一致している。 Ultrashort pulses can excite or ionize molecules and populate coherent electronic wavepackets, inducing complex dynamics. In this work, we simulate the coupled electron-nuclear dynamics upon ionization to different electronic wavepackets of (deuterated) benzene and fluoro-benzene molecules, quantum mechanically and in full dimensionality. In fluoro-benzene, the calculations unravel both inter-state and intra-state quantum interferences that leave clear signatures of attochemistry and charge-directed dynamics in the shape of the autocorrelation function. The latter are in agreement with experimental high harmonic spectroscopy measurements of benzenes and fluoro-benzene.	公開日:2024-09-23 翻訳日:2024-11-09 14:28:50
# YCB-Ev 1.1:6DoFオブジェクトポーズ推定のためのイベントビジョンデータセット YCB-Ev 1.1: Event-vision dataset for 6DoF object pose estimation ( http://arxiv.org/abs/2309.08482v2 ) ライセンス: Link先を確認	Pavel Rojtberg, Thomas Pöllabauer,	(参考訳) 本研究は,同期RGB-Dフレームとイベントデータを含むYCB-Evデータセットを導入し,これらのモダリティを用いた6DoFオブジェクトポーズ推定アルゴリズムの評価を可能にする。このデータセットは、YCB-Video(YCB-V)データセットで使用されたのと同じ21のYCBオブジェクトに対して、6DoFオブジェクトポーズのグラウンドトゥルースを提供し、データセット間でのアルゴリズム性能評価を可能にする。データセットは21の同期イベントとRGB-Dシーケンスで構成され、合計で13,851フレーム(イベントデータは7分43秒)である。特に、これらのシーケンスのうち12は、BOPチャレンジで使用されるYCB-Vサブセットと同じオブジェクト配置である。グラウンドトゥルースのポーズは、RGB-Dフレーム内のオブジェクトを検出し、イベントタイムスタンプに合わせるためにポーズを補間し、外部キャリブレーションを用いてイベント座標フレームに転送することで生成される。私たちのデータセットは、イベントストリームに対して6DoFポーズのグラウンドトゥルースを提供する最初のものです。さらに,新しいYCB-Vシーケンスを用いて,BOPチャレンジのために事前学習された2つの最先端アルゴリズムの一般化能力を評価する。データセットは https://github.com/paroj/ycbev で公開されている。 Our work introduces the YCB-Ev dataset, which contains synchronized RGB-D frames and event data that enables evaluating 6DoF object pose estimation algorithms using these modalities. This dataset provides ground truth 6DoF object poses for the same 21 YCB objects that were used in the YCB-Video (YCB-V) dataset, allowing for cross-dataset algorithm performance evaluation. The dataset consists of 21 synchronized event and RGB-D sequences, totalling 13,851 frames (7 minutes and 43 seconds of event data). Notably, 12 of these sequences feature the same object arrangement as the YCB-V subset used in the BOP challenge. Ground truth poses are generated by detecting objects in the RGB-D frames, interpolating the poses to align with the event timestamps, and then transferring them to the event coordinate frame using extrinsic calibration. Our dataset is the first to provide ground truth 6DoF pose data for event streams. Furthermore, we evaluate the generalization capabilities of two state-of-the-art algorithms, which were pre-trained for the BOP challenge, using our novel YCB-V sequences. The dataset is publicly available at https://github.com/paroj/ycbev.	公開日:2024-09-25 翻訳日:2024-11-09 14:28:50
# 量子擬似ランダムスクランブラ Quantum Pseudorandom Scramblers ( http://arxiv.org/abs/2309.08941v2 ) ライセンス: Link先を確認	Chuhan Lu, Minglong Qin, Fang Song, Penghui Yao, Mingnan Zhao,	(参考訳) 量子擬似ランダム状態発生器(PRSG)は近年、エキサイティングな発展を促している。 PRSGは、固定された初期状態(例えば全ゼロ状態)に対して、Haarランダム状態と計算的に区別できない出力状態を生成する。しかし、出力状態の擬似ランダム性は他の初期状態では保証されない。実際、既知のPRSG構成は、いくつかの初期状態で失敗することが証明できる。本研究では、任意の初期状態上で擬似ランダム状態を生成できる量子擬似ランダム状態スクランブラ(PRSS)を提案し、構築する。情報理論的な設定では、任意の初期状態を、全変動距離においてハールランダムに近い量子状態の分布にマッピングするスクランブラを得る。その結果,我々のスクランブラは分散特性を示す。大まかに言えば、状態空間の$\epsilon$-netにまたがることができる。標準的なPRSGは、平均出力状態がハールランダム状態を近似する限り、状態空間の小さな領域のみに集中しうるため、これは標準PRSGが誘導できるものを大幅に強化する。我々のPRSS構成は有名なKacの歩行の並列拡張を開発し、それが標準のKacの歩行よりも指数関数的に高速に混合することを示す。これは我々の証明の核となる。 PRSSのいくつかの応用についても述べる。我々のPRSSの構成は耐量子一方向性関数を仮定するが、PRSSは潜在的により弱いプリミティブであり、標準PRSGと同様に、相対化された世界において一方向性関数から分離することができる。 Quantum pseudorandom state generators (PRSGs) have stimulated exciting developments in recent years. A PRSG, on a fixed initial (e.g., all-zero) state, produces an output state that is computationally indistinguishable from a Haar random state. However, pseudorandomness of the output state is not guaranteed on other initial states. In fact, known PRSG constructions provably fail on some initial states. In this work, we propose and construct quantum Pseudorandom State Scramblers (PRSSs), which can produce a pseudorandom state on an arbitrary initial state. In the information-theoretical setting, we obtain a scrambler which maps an arbitrary initial state to a distribution of quantum states that is close to Haar random in total variation distance. As a result, our scrambler exhibits a dispersing property. Loosely, it can span an $\epsilon$-net of the state space. This significantly strengthens what standard PRSGs can induce, as they may only concentrate on a small region of the state space provided that average output state approximates a Haar random state. Our PRSS construction develops a parallel extension of the famous Kac's walk, and we show that it mixes exponentially faster than the standard Kac's walk. 
This constitutes the core of our proof. We also describe a few applications of PRSSs. While our PRSS construction assumes a post-quantum one-way function, PRSSs are potentially a weaker primitive and can be separated from one-way functions in a relativized world similar to standard PRSGs.	公開日:2024-09-22 翻訳日:2024-11-09 14:28:50
# OWL:IT運用のための大規模言語モデル OWL: A Large Language Model for IT Operations ( http://arxiv.org/abs/2309.09298v2 ) ライセンス: Link先を確認	Hongcheng Guo, Jian Yang, Jiaheng Liu, Liqun Yang, Linzheng Chai, Jiaqi Bai, Junran Peng, Xiaorong Hu, Chao Chen, Dongfeng Zhang, Xu Shi, Tieqiao Zheng, Liangfan Zheng, Bo Zhang, Ke Xu, Zhoujun Li,	(参考訳) IT運用の急速な発展に伴い、実用的なアプリケーションのために大量のデータを効率的に管理し、分析することがますます重要になっている。自然言語処理(NLP)の技術は、名前付きエンティティ認識、機械翻訳、対話システムなど、様々なタスクに顕著な能力を示している。最近、Large Language Models (LLM) は様々なNLPダウンストリームタスクで大幅に改善されている。しかし、IT運用には特殊なLLMが欠如している。本稿では,収集したOWL-Instructデータセットに基づいて学習した大規模言語モデルOWLを紹介する。さらに、当社が確立したOWL-Bench上でのOWLの性能を評価し、IT関連ベンチマークをオープンにする。 OWLはITタスクにおける優れたパフォーマンス結果を示しており、既存のモデルをかなり上回っている。さらに、私たちの研究の成果が、専門的なLLMでIT運用の技術に革命をもたらすことを願っています。 With the rapid development of IT operations, it has become increasingly crucial to efficiently manage and analyze large volumes of data for practical applications. The techniques of Natural Language Processing (NLP) have shown remarkable capabilities for various tasks, including named entity recognition, machine translation and dialogue systems. Recently, Large Language Models (LLMs) have achieved significant improvements across various NLP downstream tasks. However, there is a lack of specialized LLMs for IT operations. In this paper, we introduce the OWL, a large language model trained on our collected OWL-Instruct dataset with a wide range of IT-related information, where the mixture-of-adapter strategy is proposed to improve the parameter-efficient tuning across different domains or tasks. Furthermore, we evaluate the performance of our OWL on the OWL-Bench established by us and open IT-related benchmarks. OWL demonstrates superior performance results on IT tasks, which outperforms existing models by significant margins. Moreover, we hope that the findings of our work will provide more insights to revolutionize the techniques of IT operations with specialized LLMs.	公開日:2024-09-27 翻訳日:2024-11-09 14:28:50
# 古典的あるいは量子二項最適化を用いた任意の線形方程式系を解く反復アルゴリズムの収束性の改善 Improving the convergence of an iterative algorithm for solving arbitrary linear equation systems using classical or quantum binary optimization ( http://arxiv.org/abs/2309.09933v3 ) ライセンス: Link先を確認	Erick R. Castro, Eldues O. Martins, Roberto S. Sarthour, Alexandre M. Souza, Ivan S. Oliveira,	(参考訳) 量子コンピューティングと量子に触発されたアルゴリズムの最近の進歩は、バイナリ最適化に新たな関心を喚起している。これらのハードウェアとソフトウェア革新は、複雑な問題に対するソリューションタイムに革命をもたらすことを約束する。本研究では,線形システムの解法を提案する。提案手法は二項最適化を利用しており,特に条件数の多い問題に適している。線形系を二進最適化問題に変換し、元の問題の幾何学からインスピレーションを得て、共役勾配法に類似する。このアプローチでは、アルゴリズムの収束率を著しく加速する共役方向を用いる。さらに本研究では,問題の内在的幾何の部分的知識を活用することにより,元の問題をより小さく独立したサブプロブレムに分解できることを実証する。これらのサブプロブレムは量子または古典的な解法を用いて効率的に取り組める。問題の幾何を決定することは計算コストの増大をもたらすが、この投資は既存の手法に比べてかなりの性能向上に勝っている。 Recent advancements in quantum computing and quantum-inspired algorithms have sparked renewed interest in binary optimization. These hardware and software innovations promise to revolutionize solution times for complex problems. In this work, we propose a novel method for solving linear systems. Our approach leverages binary optimization, making it particularly well-suited for problems with large condition numbers. We transform the linear system into a binary optimization problem, drawing inspiration from the geometry of the original problem and resembling the conjugate gradient method. This approach employs conjugate directions that significantly accelerate the algorithm's convergence rate. Furthermore, we demonstrate that by leveraging partial knowledge of the problem's intrinsic geometry, we can decompose the original problem into smaller, independent sub-problems. These sub-problems can be efficiently tackled using either quantum or classical solvers. While determining the problem's geometry introduces some additional computational cost, this investment is outweighed by the substantial performance gains compared to existing methods.	公開日:2024-09-27 翻訳日:2024-11-09 14:28:50
# XY相互作用によるエネルギー保存量子回路の合成 Synthesis of Energy-Conserving Quantum Circuits with XY interaction ( http://arxiv.org/abs/2309.11051v3 ) ライセンス: Link先を確認	Ge Bai, Iman Marvian,	(参考訳) 我々は、$\sqrt{iSWAP}$ゲートと、より一般的には、XX+YY相互作用だけで実現できるエンタングルゲートから構築された量子回路について研究する。このようなゲートは計算基底における状態のハミング重みを保ち、これはz軸周りの回転に対応する大域的U(1)対称性を尊重することを意味する。同様に、系内の各量子ビットの固有ハミルトニアンがパウリZ作用素であると仮定すると、系全体のエネルギーは保存される。我々は,z軸まわりの単一量子ビット回転の有無にかかわらず,XX+YY相互作用を用いて所望のエネルギー保存ユニタリを実現する回路を効率的に合成する方法を開発した。興味深いことに、CCZやFredkinゲートのような一般的なエネルギー保存ユニタリを2局所エネルギー保存ゲートで実装するには、アンシラ量子ビットを使用する必要がある。 z軸周りの単一量子ビット回転が許されるとき、我々のスキームは1つのアンシラ量子ビットしか必要としないが、XX+YY相互作用だけでは2つのアンシラ量子ビットを必要とする。厳密な実現に加えて、近似的な実現についても検討し、$\sqrt{iSWAP}$ ゲートの列と 2 個の補助量子ビットのみを用いて、一般のエネルギー保存ユニタリを任意に小さい誤差で合成できることを示す。この誤差はソロヴェイ・キタエフの定理によって上から抑えられる。我々の方法は、XX+YY相互作用ではなく、ハイゼンベルク交換相互作用のような、計算基底で対角的でない他のエネルギー保存2体相互作用にアクセスできる場合にも、エネルギー保存ユニタリの合成に応用できる。量子コンピューティング、量子熱力学、量子時計の文脈におけるこれらの回路の応用について簡単に論じる。 We study quantum circuits constructed from $\sqrt{iSWAP}$ gates and, more generally, from the entangling gates that can be realized with the XX+YY interaction alone. Such gates preserve the Hamming weight of states in the computational basis, which means they respect the global U(1) symmetry corresponding to rotations around the z axis. Equivalently, assuming that the intrinsic Hamiltonian of each qubit in the system is the Pauli Z operator, they conserve the total energy of the system. We develop efficient methods for synthesizing circuits realizing any desired energy-conserving unitary using XX+YY interaction with or without single-qubit rotations around the z-axis. Interestingly, implementing generic energy-conserving unitaries, such as CCZ and Fredkin gates, with 2-local energy-conserving gates requires the use of ancilla qubits. When single-qubit rotations around the z-axis are permitted, our scheme requires only a single ancilla qubit, whereas with the XX+YY interaction alone, it requires 2 ancilla qubits. 
In addition to exact realizations, we also consider approximate realizations and show how a general energy-conserving unitary can be synthesized using only a sequence of $\sqrt{iSWAP}$ gates and 2 ancillary qubits, with arbitrarily small error, which can be bounded via the Solovay-Kitaev theorem. Our methods are also applicable for synthesizing energy-conserving unitaries when, rather than the XX+YY interaction, one has access to any other energy-conserving 2-body interaction that is not diagonal in the computational basis, such as the Heisenberg exchange interaction. We briefly discuss the applications of these circuits in the context of quantum computing, quantum thermodynamics, and quantum clocks.	公開日:2024-09-23 翻訳日:2024-11-09 14:28:50
# EPTQ: Hessian-Guided Network-wise Optimization による学習後量子化の強化 EPTQ: Enhanced Post-Training Quantization via Hessian-guided Network-wise Optimization ( http://arxiv.org/abs/2309.11531v2 ) ライセンス: Link先を確認	Ofir Gordon, Elad Cohen, Hai Victor Habi, Arnon Netzer,	(参考訳) 量子化は、メモリと計算リソースが限られているエッジデバイスにディープニューラルネットワークをデプロイするための重要な方法である。ポストトレーニング量子化法(PTQ)の最近の改良は、重み量子化ラウンドリングポリシーを学習するための局所最適化プロセスによって達成された。しかし、小さな代表データセットでネットワークワイズ最適化を採用する場合、ギャップが存在する。本稿では,ネットワークワイド量子化最適化プロセスを利用するEPTQ(Advanced PTQ)の新たな手法を提案する。 EPTQは,ラベルフリーなヘッセン行列上界に基づく新しいサンプル層アテンションスコアを用いた,小さな代表データセットによるネットワークワイズ最適化を実現する。ラベルのない手法はPTQ方式に適合する。以上の境界について理論的解析を行い、それを用いて、より繊細な層やサンプルに焦点を合わせるよう最適化する知識蒸留損失を構築する。さらに,重みテンソルの高感度要素に着目し,重み量子化パラメータの選択を改善するためにヘッセン上界を利用する。 EPTQを用いることで、ImageNet分類、COCOオブジェクト検出、意味的セグメンテーションのためのPascal-VOCなど、さまざまなモデル、タスク、データセットの最先端結果が得られる。 Quantization is a key method for deploying deep neural networks on edge devices with limited memory and computation resources. Recent improvements in Post-Training Quantization (PTQ) methods were achieved by an additional local optimization process for learning the weight quantization rounding policy. However, a gap exists when employing network-wise optimization with small representative datasets. In this paper, we propose a new method for enhanced PTQ (EPTQ) that employs a network-wise quantization optimization process, which benefits from considering cross-layer dependencies during optimization. EPTQ enables network-wise optimization with a small representative dataset using a novel sample-layer attention score based on a label-free Hessian matrix upper bound. The label-free approach makes our method suitable for the PTQ scheme. We give a theoretical analysis for the said bound and use it to construct a knowledge distillation loss that guides the optimization to focus on the more sensitive layers and samples. 
In addition, we leverage the Hessian upper bound to improve the selection of weight quantization parameters by focusing on the more sensitive elements in the weight tensors. Empirically, by employing EPTQ we achieve state-of-the-art results on various models, tasks, and datasets, including ImageNet classification, COCO object detection, and Pascal-VOC for semantic segmentation.	公開日:2024-09-26 翻訳日:2024-11-09 14:28:50
# 局所周期駆動を用いた光学格子の個別可変トンネル係数 Individually tunable tunnelling coefficients in optical lattices using local periodic driving ( http://arxiv.org/abs/2309.12124v2 ) ライセンス: Link先を確認	Georgia M. Nixon, F. Nur Unal, Ulrich Schneider,	(参考訳) 光格子中の超低温原子は、翻訳不変系の強力な量子シミュレータとして登場し、eg \ 強相関系および位相系に多くの応用がある。しかしながら、すべてのハミルトンパラメータを局所的にチューニングする能力は、より広い範囲の量子現象のシミュレーションを可能にする、優れた目標のままである。量子ガス顕微鏡と光ツイーザの最近の進歩により、光格子内の個々のトンネルリンクに対する局所的な制御は、局所的な時間周期ポテンシャルを組み込むことで、どのように達成できるかを理論的に示す。本研究では,各格子の現場エネルギーを周期的に変調し,Floquet理論を用いて1次元のトンネル振幅を個別に制御する方法を実証する。興味あるトポロジモデル(例えば拡張Su-Schrieffer-Heegerモデル)を実現するための様々な例を提供する。 2次元に拡張すると、リーブ格子の局所周期運転は、完全に制御可能なトンネル等級を持つ2次元ネットワークを設計する。 3サイト・プラケットでは, 相対的なトンネル振幅とゲージ不変フラックスを同時に同時に制御し, 完全にプログラム可能な2次元強結合モデルを構築するための明確なステップストーンを提供する。また、2次元の磁場勾配を生成するために、我々の技術をどのように活用するかを明確に示す。この局所変調スキームは多くの異なる格子幾何学に適用できる。 Ultracold atoms in optical lattices have emerged as powerful quantum simulators of translationally invariant systems with many applications in e.g.\ strongly-correlated and topological systems. However, the ability to locally tune all Hamiltonian parameters remains an outstanding goal that would enable the simulation of a wider range of quantum phenomena. Motivated by recent advances in quantum gas microscopes and optical tweezers, we here show theoretically how local control over individual tunnelling links in an optical lattice can be achieved by incorporating local time-periodic potentials. We propose to periodically modulate the on-site energy of individual lattice sites and employ Floquet theory to demonstrate how this provides full individual control over the tunnelling amplitudes in one dimension. We provide various example configurations realising interesting topological models such as extended Su-Schrieffer-Heeger models that would be challenging to realise by other means. Extending to two dimensions, we demonstrate that local periodic driving in a Lieb lattice engineers a 2D network with fully controllable tunnelling magnitudes. 
In a three-site plaquette, we show full simultaneous control over the relative tunnelling amplitudes and the gauge-invariant flux piercing the plaquette, providing a clear stepping stone to building a fully programmable 2D tight-binding model. We also explicitly demonstrate how to utilise our technique to generate a magnetic field gradient in 2D. This local modulation scheme is applicable to many different lattice geometries.	公開日:2024-09-24 翻訳日:2024-11-09 14:28:50
# ウィグナーの友情シナリオと非古典的因果適合性, モノガミー関係, 微調整との関係 Relating Wigner's Friend scenarios to Nonclassical Causal Compatibility, Monogamy Relations, and Fine Tuning ( http://arxiv.org/abs/2309.12987v3 ) ライセンス: Link先を確認	Yìlè Yīng, Marina Maciel Ansanelli, Andrea Di Biagio, Elie Wolfe, Eric Gama Cavalcanti,	(参考訳) 非古典的因果モデリングは、相対論的因果構造と忠実性に固執しつつ、ベルの不平等の違反を説明するために開発された。近年、ベルの定理より強いと見なせるノーゴー定理が導出され、ウィグナーの友人の思考実験であるローカルフレンドリー(LF)のノーゴー定理の拡張に基づいている。ここでは、LFのノーゴー定理は、非古典的あるいは循環的因果的説明が考慮されたとしても、因果的モデリングの分野において重大な課題をもたらすことを示す。我々はまず、統計的境界問題から生じる単ガミー関係の特別な場合として、LFノゴー定理の重要な要素の一つであるLF不等式をリキャストした。さらに,不等式を非古典的因果補間問題から生じる因果補間不等式として,よく動機付けられた因果補間仮定によって示唆される因果構造について再検討した。この因果構造からLF不等式が現れるのは、一般に確率論やさらにエキゾチックな理論のように、観測された事象の潜伏原因が量子後記述を許容する場合であってもである。さらに、非古典的因果モデルでは、No Fine-Tuning原則に違反することなくLF不平等の違反を説明できないことを証明している。最後に、循環因果モデルに訴えてもこれらの障害は克服できないことに留意し、因果モデリングフレームワークのさらなる拡張の可能性について論じる。 Nonclassical causal modeling was developed in order to explain violations of Bell inequalities while adhering to relativistic causal structure and faithfulness -- that is, avoiding fine-tuned causal explanations. Recently, a no-go theorem that can be viewed as being stronger than Bell's theorem has been derived, based on extensions of the Wigner's friend thought experiment: the Local Friendliness (LF) no-go theorem. Here we show that the LF no-go theorem poses formidable challenges for the field of causal modeling, even when nonclassical and/or cyclic causal explanations are considered. We first recast the LF inequalities, one of the key elements of the LF no-go theorem, as special cases of monogamy relations stemming from a statistical marginal problem. We then further recast LF inequalities as causal compatibility inequalities stemming from a nonclassical causal marginal problem, for a causal structure implied by well-motivated causal-metaphysical assumptions. 
We find that the LF inequalities emerge from this causal structure even when one allows the latent causes of observed events to admit post-quantum descriptions, such as in a generalized probabilistic theory or in an even more exotic theory. We further prove that no nonclassical causal model can explain violations of LF inequalities without violating the No Fine-Tuning principle. Finally, we note that these obstacles cannot be overcome even if one appeals to cyclic causal models, and we discuss potential directions for further extensions of the causal modeling framework.	公開日:2024-09-25 翻訳日:2024-11-09 14:28:50
# ウィグナーの友人シナリオと非古典的因果適合性, モノガミー関係, 微調整との関連性 Relating Wigner's Friend Scenarios to Nonclassical Causal Compatibility, Monogamy Relations, and Fine Tuning ( http://arxiv.org/abs/2309.12987v4 ) ライセンス: Link先を確認	Yìlè Yīng, Marina Maciel Ansanelli, Andrea Di Biagio, Elie Wolfe, David Schmid, Eric Gama Cavalcanti,	(参考訳) 非古典的因果モデリングは、相対論的因果構造と忠実性に固執しつつ、ベルの不平等の違反を説明するために開発された。近年、ベルの定理より強いと見なせるノーゴー定理が導出され、ウィグナーの友人の思考実験であるローカルフレンドリー(LF)のノーゴー定理の拡張に基づいている。ここでは、LFのノーゴー定理は、非古典的あるいは循環的因果的説明が考慮されたとしても、因果的モデリングの分野において重大な課題をもたらすことを示す。我々はまず、統計的境界問題から生じる単ガミー関係の特別な場合として、LFノゴー定理の重要な要素の一つであるLF不等式をリキャストした。さらに,不等式を非古典的因果補間問題から生じる因果補間不等式として,よく動機付けられた因果補間仮定によって示唆される因果構造について再検討した。この因果構造からLF不等式が現れるのは、一般に確率論やさらにエキゾチックな理論のように、観測された事象の潜伏原因が量子後記述を許容する場合であってもである。さらに、非古典的因果モデルでは、No Fine-Tuning原則に違反することなくLF不平等の違反を説明できないことを証明している。最後に、循環因果モデルに訴えてもこれらの障害は克服できないことに留意し、因果モデリングフレームワークのさらなる拡張の可能性について論じる。 Nonclassical causal modeling was developed in order to explain violations of Bell inequalities while adhering to relativistic causal structure and faithfulness -- that is, avoiding fine-tuned causal explanations. Recently, a no-go theorem that can be viewed as being stronger than Bell's theorem has been derived, based on extensions of the Wigner's friend thought experiment: the Local Friendliness (LF) no-go theorem. Here we show that the LF no-go theorem poses formidable challenges for the field of causal modeling, even when nonclassical and/or cyclic causal explanations are considered. We first recast the LF inequalities, one of the key elements of the LF no-go theorem, as special cases of monogamy relations stemming from a statistical marginal problem. We then further recast LF inequalities as causal compatibility inequalities stemming from a nonclassical causal marginal problem, for a causal structure implied by well-motivated causal-metaphysical assumptions. 
We find that the LF inequalities emerge from this causal structure even when one allows the latent causes of observed events to admit post-quantum descriptions, such as in a generalized probabilistic theory or in an even more exotic theory. We further prove that no nonclassical causal model can explain violations of LF inequalities without violating the No Fine-Tuning principle. Finally, we note that these obstacles cannot be overcome even if one appeals to cyclic causal models, and we discuss potential directions for further extensions of the causal modeling framework.	公開日:2024-09-25 翻訳日:2024-11-09 14:28:50
# アルゴリズム採用における公正性とバイアス--多分野調査 Fairness and Bias in Algorithmic Hiring: a Multidisciplinary Survey ( http://arxiv.org/abs/2309.13933v3 ) ライセンス: Link先を確認	Alessandro Fabris, Nina Baranowska, Matthew J. Dennis, David Graus, Philipp Hacker, Jorge Saldivar, Frederik Zuiderveen Borgesius, Asia J. Biega,	(参考訳) 雇用者は採用パイプライン全体を通してアルゴリズムによる雇用技術を採用しています。アルゴリズム的公正性は、高い利害関係と構造的不等式のため、この領域で特に適用できる。残念ながら、この分野のほとんどの研究は部分的な扱いを提供しており、しばしば2つの競合する物語によって制約される。アルゴリズムによる雇用のバイアスが減り、社会に利益をもたらすかどうか、そしてさらに重要なことは、信頼感の低下に対して、現在のローテクな代替手段は未解決のままだ。この多分野にわたる調査は、システム、バイアス、尺度、緩和戦略、データセット、およびアルゴリズム雇用と公正性の法的側面のバランスよく統合されたカバレッジを持つ実践者や研究者に向けられている。私たちの仕事は、現在の機会と制限を強調し、すべての利害関係者に対する共有メリットを保証するために、将来の作業に対する推奨を提供することによって、この技術のコンテキスト化された理解とガバナンスを支援します。 Employers are adopting algorithmic hiring technology throughout the recruitment pipeline. Algorithmic fairness is especially applicable in this domain due to its high stakes and structural inequalities. Unfortunately, most work in this space provides partial treatment, often constrained by two competing narratives, optimistically focused on replacing biased recruiter decisions or pessimistically pointing to the automation of discrimination. Whether, and more importantly what types of, algorithmic hiring can be less biased and more beneficial to society than low-tech alternatives currently remains unanswered, to the detriment of trustworthiness. This multidisciplinary survey caters to practitioners and researchers with a balanced and integrated coverage of systems, biases, measures, mitigation strategies, datasets, and legal aspects of algorithmic hiring and fairness. Our work supports a contextualized understanding and governance of this technology by highlighting current opportunities and limitations, providing recommendations for future work to ensure shared benefits for all stakeholders.	公開日:2024-09-24 翻訳日:2024-11-09 14:28:50
# 不可逆性としての誤差と外乱:統一定義、ウィグナー-アーナキ-ヤナーゼ理論および時間外相関器 Error and Disturbance as Irreversibility with Applications: Unified Definition, Wigner--Araki--Yanase Theorem and Out-of-Time-Order Correlator ( http://arxiv.org/abs/2309.14172v2 ) ライセンス: Link先を確認	Haruki Emori, Hiroyasu Tajima,	(参考訳) ハイゼンベルクの不確実性原理の提案以来、量子測定の誤りと乱れは量子物理学の基本的な概念となっている。量子物理学において物理量を定義する場合と同様に、これらの2つの概念を定義する単一の方法はなく、多くの独立した定義が与えられている。ここでは、量子過程における不可逆性の特別な場合として、誤差と乱れを定義する新しい定式化を確立する。この定式化により、確率的熱力学と量子情報理論における不可逆性の知識を量子測定の誤差と乱れに適用することができる。この強さを示すために、我々は3つの副産物を提供する: まず、既存の誤りと乱れの定式化を統一する。第二に、量的ウィグナー・アラキ・ヤナーゼ定理(保存法に基づく測定実施に関する普遍的な制限)を任意の定義やプロセスの誤りや乱れに拡張する。第三に、我々の定式化は、量子多体系における量子カオスの尺度であるアウト・オブ・タイム・オーダード・コレレータ(out-of-time-orderd-correlator)を、測定コンテキストと類似の不可逆性としてカバーし、その実験的評価方法を提供する。 Since the proposal of Heisenberg's uncertainty principle, error and disturbance of quantum measurements have been fundamental notions in quantum physics. As is often the case when defining physical quantities in quantum physics, there is no single way to define these two notions, and many independent definitions of them have been given. Here, we establish a novel formulation defining the error and disturbance as special cases of the irreversibility in quantum processes. The formulation enables us to apply the knowledge of irreversibility in stochastic thermodynamics and quantum information theory to the error and disturbance in quantum measurements. To demonstrate this strength, we provide three byproducts: First, we unify the existing formulations of error and disturbance. Second, we extend the quantitative Wigner--Araki--Yanase theorem -- a universal restriction on measurement implementation under a conservation law -- to errors and disturbances of arbitrary definitions and processes. 
Third, we reveal that our formulation covers the out-of-time-ordered correlator -- a measure of quantum chaos in a quantum many-body system -- as the irreversibility in analogy with the measurement context, and provide its experimental evaluation method.	公開日:2024-09-27 翻訳日:2024-11-09 14:28:50
# Informative Manifold Projection を用いたクラスタ探索 Cluster Exploration using Informative Manifold Projections ( http://arxiv.org/abs/2309.14857v3 ) ライセンス: Link先を確認	Stavros Gerolymatos, Xenophon Evangelopoulos, Vladimir Gusev, John Y. Goulermas,	(参考訳) 次元性低減(DR)は、高次元データの視覚的な探索と、2次元または3次元空間におけるクラスタ構造を明らかにするための重要なツールの1つである。文献におけるDR手法の大部分は、実践者が検討中のデータセットに関する事前知識を考慮に入れていない。本稿では,従来の知識の異なる構造を抽出するだけでなく,その基盤となる構造を明らかにすることを目的とした,情報埋め込みを生成する新しい手法を提案する。これを実現するために,まず,先行情報に関連付けられた構造を縮小するコントラストPCAと,得られた埋め込みにおいて有意なデータ分離を保証するクルトーシス投影探索という2つの目的を線形に組み合わせた。本稿では,この課題を多様体最適化問題として定式化し,3種類の事前知識を考慮に入れた多種多様なデータセットを経験的に検証する。最後に,高次元データの反復的視覚探索を行うためのフレームワークを提供する。 Dimensionality reduction (DR) is one of the key tools for the visual exploration of high-dimensional data and uncovering its cluster structure in two- or three-dimensional spaces. The vast majority of DR methods in the literature do not take into account any prior knowledge a practitioner may have regarding the dataset under consideration. We propose a novel method to generate informative embeddings which not only factor out the structure associated with different kinds of prior knowledge but also aim to reveal any remaining underlying structure. To achieve this, we employ a linear combination of two objectives: firstly, contrastive PCA that discounts the structure associated with the prior information, and secondly, kurtosis projection pursuit which ensures meaningful data separation in the obtained embeddings. We formulate this task as a manifold optimization problem and validate it empirically across a variety of datasets considering three distinct types of prior knowledge. Lastly, we provide an automated framework to perform iterative visual exploration of high-dimensional data.	公開日:2024-09-27 翻訳日:2024-11-09 14:28:50
# Can-SAVE:生存分析変数とHRによる大量がんリスク予測 Can-SAVE: Mass Cancer Risk Prediction via Survival Analysis Variables and EHR ( http://arxiv.org/abs/2309.15039v2 ) ライセンス: Link先を確認	Petr Philonenko, Vladimir Kokh, Pavel Blinov,	(参考訳) 特定のがんスクリーニング法は、しばしば費用がかかり、時間がかかり、大規模に適用できる。高度な人工知能(AI)法は、がんの検出に大いに役立つが、特定のまたは深い医療データを必要とする。これらの側面は、がんスクリーニング法の大量実装を妨げる。そのため、既存のElectronic Health Records(EHR)ボリュームに基づいて、がんリスクの大量パーソナライズされた評価にAI手法を適用することは、医療にとって破壊的な変化である。本稿では,Can-SAVE癌リスク評価手法を提案する。アクセス性が高く、資源効率が良く、一連の高レベルの医療イベントのみを利用する。提案手法をロシア国内1100万人以上の住民と4つの地域を対象とした長期的ふりかえり実験で検証した。 Can-SAVE法は平均精度22.8%$\pm$2.7%対15.1%$\pm$2.6%の基準値を大きく上回る。広範囲にわたるアブレーション試験により,提案手法の優位性が確認された。腫瘍学者が監督する実験では、1000人中84人のがん患者が確実に検出されることが示された。これらの結果は, 経時的に要する年齢差が1000例中9例に留まっている(大腸癌の場合)。以上の結果から,従来の医療リスク評価手法に比べて癌検出率(TOP@1k)は4.7-6.4倍向上した。 Specific medical cancer screening methods are often costly, time-consuming, and weakly applicable on a large scale. Advanced Artificial Intelligence (AI) methods greatly help cancer detection but require specific or deep medical data. These aspects prevent the mass implementation of cancer screening methods. For this reason, it is a disruptive change for healthcare to apply AI methods for mass personalized assessment of the cancer risk among patients based on the existing Electronic Health Records (EHR) volume. This paper presents a novel Can-SAVE cancer risk assessment method combining a survival analysis approach with a gradient-boosting algorithm. It is highly accessible and resource-efficient, utilizing only a sequence of high-level medical events. We tested the proposed method in a long-term retrospective experiment covering more than 1.1 million people and four regions of Russia. The Can-SAVE method significantly exceeds the baselines by the Average Precision metric of 22.8%$\pm$2.7% vs 15.1%$\pm$2.6%. The extensive ablation study also confirmed the proposed method's dominant performance. 
An experiment supervised by oncologists shows a reliable detection rate of up to 84 cancer patients per 1000 selected. These results surpass the estimates for medical screening strategies; the typical age-specific Number Needed to Screen is only 9 out of 1000 (for colorectal cancer). Overall, our experiments show a 4.7-6.4 times improvement in cancer detection rate (TOP@1k) compared to the traditional healthcare risk estimation approach.	公開日:2024-09-27 翻訳日:2024-11-09 10:12:15
# 機械学習のためのハミングウェイト保存量子回路の訓練性と表現性 Trainability and Expressivity of Hamming-Weight Preserving Quantum Circuits for Machine Learning ( http://arxiv.org/abs/2309.15547v2 ) ライセンス: Link先を確認	Léo Monbroussou, Eliott Z. Mamon, Jonas Landman, Alex B. Grilo, Romain Kukla, Elham Kashefi,	(参考訳) 量子機械学習(QML)は、量子コンピュータの現実的な応用にとって有望な分野となっているが、短期的手法とその拡張性は依然として重要な研究トピックである。この文脈では、変動量子回路(VQC)を保存した特定のハミング重みのトレーナビリティと制御性について分析する。これらの回路は、ヒルベルト空間の部分空間を保存するクォービットゲートを使用し、固定ハミング重み$k$の基底状態で区切られている。本研究では、まず、新しいヒューリスティックなデータローダの実現可能性を示し、$n$-qubit量子回路をトレーニングすることにより、$\binom{n}{k}$-dimensionalベクトルの量子振幅符号化を行う。これらのデータローダは、QFIM(Quantum Fisher Information Matrix)のランクをチェックし、次元削減技術を用いて得られる。第2に、任意の VQC 状態の QFIM のランクがほぼどこでも一定であり、これは別の関心事であるという事実を理論的に正当化する。最後に、ハミング重み保存回路のトレーニング可能性を分析し、その部分空間の次元$\binom{n}{k}$に応じて、$l_2$コスト関数勾配のばらつきが有界であることを示す。このことは、これらの回路に対するバレンプラトーの存在/欠如の条件を証明し、近年の制御可能性と変分量子回路のトレーニング可能性の関係に関する予想が適用されない状況を強調している。 Quantum machine learning (QML) has become a promising area for real world applications of quantum computers, but near-term methods and their scalability are still important research topics. In this context, we analyze the trainability and controllability of specific Hamming weight preserving variational quantum circuits (VQCs). These circuits use qubit gates that preserve subspaces of the Hilbert space, spanned by basis states with fixed Hamming weight $k$. In this work, we first design and prove the feasibility of new heuristic data loaders, performing quantum amplitude encoding of $\binom{n}{k}$-dimensional vectors by training an $n$-qubit quantum circuit. These data loaders are obtained using dimensionality reduction techniques, by checking the Quantum Fisher Information Matrix (QFIM)'s rank. Second, we provide a theoretical justification for the fact that the rank of the QFIM of any VQC state is almost-everywhere constant, which is of separate interest. 
Lastly, we analyze the trainability of Hamming weight preserving circuits, and show that the variance of the $l_2$ cost function gradient is bounded according to the dimension $\binom{n}{k}$ of the subspace. This proves conditions for the existence or absence of barren plateaus in these circuits, and highlights a setting where a recent conjecture on the link between controllability and trainability of variational quantum circuits does not apply.	公開日:2024-09-26 翻訳日:2024-11-09 10:12:15
# ミューオン崩壊における相対論的絡み合い Relativistic entanglement in muon decay ( http://arxiv.org/abs/2309.15863v2 ) ライセンス: Link先を確認	S. Carneiro, F. C. Sobrinho,	(参考訳) 非折り畳み相互作用の存在下での量子絡みの時間進化について論じる。特に、磁場中におけるミューオン崩壊生成物の絡み合いを再考する。これは角運動量保存の結果であり、ブルックヘイブンとフェルミラブの実験によって報告されたものと正確な一致で測定されたミューオンg因子の異常をもたらす。 We discuss the time evolution of quantum entanglement in presence of non-collapsing interactions. In particular, the entanglement between the products of a muon decay in a magnetic field is revisited. It results from angular momentum conservation and leads to an anomaly in the measured muon g factor in precise agreement with that reported by the Brookhaven and Fermilab experiments.	公開日:2024-09-26 翻訳日:2024-11-09 10:12:15
# 支援を受けるための学習: 介入を意識した概念埋め込みモデル Learning to Receive Help: Intervention-Aware Concept Embedding Models ( http://arxiv.org/abs/2309.16928v3 ) ライセンス: Link先を確認	Mateo Espinosa Zarlenga, Katherine M. Collins, Krishnamurthy Dvijotham, Adrian Weller, Zohreh Shams, Mateja Jamnik,	(参考訳) 概念ボトルネックモデル (Concept Bottleneck Models, CBM) は、高レベルの概念セットを使用して予測を構築し、説明することによって、ニューラルネットワークの不透明さに対処する。これらのモデルの特別な特性は、ユーザーが誤予測された概念を修正でき、それによってモデルの性能が向上する、概念の介入を許すことである。しかし、最近の研究は、介入効果は概念が介入される順序やモデルのアーキテクチャやハイパーパラメーターの訓練に大きく依存することを示した。これは、モデルが概念的介入に適切に受容されるための、CBMの列車時のインセンティブの欠如に起因している、と我々は主張する。そこで我々は,新しいCBMアーキテクチャとトレーニングパラダイムであるIntervention-Aware Concept Embedding Model (IntCEMs)を提案する。我々のモデルは、列車の時間に意味のある介入経路をサンプリングできるエンド・ツー・エンド方式の概念介入ポリシーを学習する。この条件では、IntCEMは、テスト時にデプロイされたコンセプトの介入を効果的に選択し、受け取ります。実験の結果,IntCEMはテスト時間の概念介入を施す場合,最先端の概念解釈モデルよりも優れており,本手法の有効性が示された。 Concept Bottleneck Models (CBMs) tackle the opacity of neural architectures by constructing and explaining their predictions using a set of high-level concepts. A special property of these models is that they permit concept interventions, wherein users can correct mispredicted concepts and thus improve the model's performance. Recent work, however, has shown that intervention efficacy can be highly dependent on the order in which concepts are intervened on and on the model's architecture and training hyperparameters. We argue that this is rooted in a CBM's lack of train-time incentives for the model to be appropriately receptive to concept interventions. To address this, we propose Intervention-aware Concept Embedding models (IntCEMs), a novel CBM-based architecture and training paradigm that improves a model's receptiveness to test-time interventions. Our model learns a concept intervention policy in an end-to-end fashion from where it can sample meaningful intervention trajectories at train-time. This conditions IntCEMs to effectively select and receive concept interventions when deployed at test-time. 
Our experiments show that IntCEMs significantly outperform state-of-the-art concept-interpretable models when provided with test-time concept interventions, demonstrating the effectiveness of our approach.	公開日:2024-09-26 翻訳日:2024-11-09 10:12:15
# 評価指標としての大規模言語モデルにおける認知バイアスのベンチマーク Benchmarking Cognitive Biases in Large Language Models as Evaluators ( http://arxiv.org/abs/2309.17012v3 ) ライセンス: Link先を確認	Ryan Koo, Minhwa Lee, Vipul Raheja, Jong Inn Park, Zae Myung Kim, Dongyeop Kang,	(参考訳) 大規模言語モデルは認知的に偏見のある裁判官である。大規模言語モデル(LLM)は、最近、簡単なプロンプトと文脈内学習を備えた自動評価器として有効であることが示されている。本研究では,4つの異なるサイズ範囲の15個のLLMを組み立て,システムスターがシステムスクエアよりも優れているような評価器として,他のLLMからの優先順位付けによる出力応答の評価を行う。次に、LCM評価出力の6つの異なる認知バイアスを測定するベンチマークであるCoBBLEr(CoBBLEr)として、LCMの認知バイアスベンチマークを導入したランキングアウトプットの品質を評価する。 LLMはテキスト品質評価器であり、評価器としての頑健性に疑問を呈する評価のそれぞれにおいて、バイアスベンチマーク(すべてのモデルで比較される平均40%)に強い指標を示す。さらに,人間と機械の嗜好の相関について検討し,平均ランクバイアスオーバーラップ(RBO)スコアを49.6%と算出し,機械選好が人間と不一致であることを示唆した。以上の結果から,LLMは人間の嗜好に沿った自動アノテーションには利用できない可能性が示唆された。私たちのプロジェクトページは以下の通りです。 Large Language Models are cognitively biased judges. Large Language Models (LLMs) have recently been shown to be effective as automatic evaluators with simple prompting and in-context learning. In this work, we assemble 15 LLMs of four different size ranges and evaluate their output responses by preference ranking from the other LLMs as evaluators, such as System Star is better than System Square. We then evaluate the quality of ranking outputs introducing the Cognitive Bias Benchmark for LLMs as Evaluators (CoBBLEr), a benchmark to measure six different cognitive biases in LLM evaluation outputs, such as the Egocentric bias where a model prefers to rank its own outputs highly in evaluation. We find that LLMs are biased text quality evaluators, exhibiting strong indications on our bias benchmark (average of 40% of comparisons across all models) within each of their evaluations that question their robustness as evaluators. Furthermore, we examine the correlation between human and machine preferences and calculate the average Rank-Biased Overlap (RBO) score to be 49.6%, indicating that machine preferences are misaligned with humans. 
According to our findings, LLMs may not yet be usable for automatic annotation aligned with human preferences. Our project page is at: https://minnesotanlp.github.io/cobbler.	公開日:2024-09-25 翻訳日:2024-11-09 10:12:15
# 9歳の子どもたちは感情でChatGPTを上回り-中国語の文章から Nine-year-old children outperformed ChatGPT in emotion: Evidence from Chinese writing ( http://arxiv.org/abs/2310.00578v2 ) ライセンス: Link先を確認	Siyi Cao, Yizhong Xu, Tongquan Zhou, Siruo Zhou,	(参考訳) 近年の研究では、ChatGPTは複雑な人間のようなテキストを生成する能力を持つことが実証されており、心的タスクの理論におけるその性能は、9歳の子供に匹敵するものであることが確認されている。しかし、ChatGPTが中国語の筆記能力で9歳の子供を上回っているかどうかは不明である。そこで本研究では,ChatGPTと9歳児のナラティブと科学の両面から,ChatGPTの相対的な強みと弱さを明らかにすることを目的として,中国語の筆記能力について検討した。収集したデータは、流布度、精度、複雑さ、凝集度、感情の5つの言語次元で分析された。各次元は正確な指標によって評価された。以上の結果から,9歳児は書字の流布度や結束度においてChatGPT以上に優れていた。一方,ChatGPTは,子どもに比べて精度が優れていた。複雑性に関して、子どもたちは科学をテーマとした執筆において優れたスキルを示し、一方でChatGPTは自然をテーマとした執筆において優位に立った。この研究は、中国の作文において、9歳の子供がChatGPTよりも強い感情を伝えることを明らかにする先駆的な研究である。 ChatGPT has been demonstrated to possess significant capabilities in generating intricate, human-like text, and recent studies have established that its performance in theory of mind tasks is comparable to that of a nine-year-old child. However, it remains uncertain whether ChatGPT surpasses nine-year-old children in Chinese writing proficiency. To explore this, our study juxtaposed the Chinese writing performance of ChatGPT and nine-year-old children on both narrative and scientific topics, aiming to uncover the relative strengths and weaknesses of ChatGPT in writing. The collected data were analyzed across five linguistic dimensions: fluency, accuracy, complexity, cohesion, and emotion. Each dimension underwent assessment through precise indices. The findings revealed that nine-year-old children excelled beyond ChatGPT in terms of fluency and cohesion within their writing. In contrast, ChatGPT manifested a superior performance in accuracy compared to the children. Concerning complexity, children exhibited superior skills in science-themed writing, while ChatGPT prevailed in nature-themed writing. Significantly, this research is pioneering in revealing that nine-year-old children convey stronger emotions than ChatGPT in their Chinese compositions.	
公開日:2024-09-24 翻訳日:2024-11-09 10:12:15
# 大規模言語モデル生成データのソース属性 Source Attribution for Large Language Model-Generated Data ( http://arxiv.org/abs/2310.00646v2 ) ライセンス: Link先を確認	Jingtan Wang, Xinyang Lu, Zitong Zhao, Zhongxiang Dai, Chuan-Sheng Foo, See-Kiong Ng, Bryan Kian Hsiang Low,	(参考訳) LLM(Large Language Models)の印象的なパフォーマンスと商業化の可能性は、トレーニングデータの知的財産権(IP)に対する深刻な懸念を引き起こしている。特に、LLMによって生成された合成テキストは、LLMのトレーニングに使用されるデータのIPを侵害する可能性がある。この目的のために、LLMによる合成テキストの生成に寄与したデータ提供者を特定することにより、ソース属性を実現できることが不可欠である。そこで本稿では,LLMが電子透かしを組み込んだ合成テキストを作成できるようにし,電子透かしによってこの問題に対処できることを述べる。このようなウォーターマーキングフレームワーク(例えば、ソース属性の精度、敵に対するロバスト性)の鍵となる特性を特定し、アルゴリズム設計によりこれらの重要な特性を満たすソース属性フレームワークを提案する。我々のフレームワークは,LLMが生成したテキストからデータ提供者への正確なマッピングを学習することを可能にする。大規模な実証実験により,本フレームワークが効果的な情報源属性を達成できることが示唆された。 The impressive performances of Large Language Models (LLMs) and their immense potential for commercialization have given rise to serious concerns over the Intellectual Property (IP) of their training data. In particular, the synthetic texts generated by LLMs may infringe the IP of the data being used to train the LLMs. To this end, it is imperative to be able to perform source attribution by identifying the data provider who contributed to the generation of a synthetic text by an LLM. In this paper, we show that this problem can be tackled by watermarking, i.e., by enabling an LLM to generate synthetic texts with embedded watermarks that contain information about their source(s). We identify the key properties of such watermarking frameworks (e.g., source attribution accuracy, robustness against adversaries), and propose a source attribution framework that satisfies these key properties due to our algorithmic designs. Our framework enables an LLM to learn an accurate mapping from the generated texts to data providers, which sets the foundation for effective source attribution. Extensive empirical evaluations show that our framework achieves effective source attribution.	公開日:2024-09-25 翻訳日:2024-11-09 10:12:15
# すべてのデータセット数:ジョイントデータセットトレーニングによる単眼3Dオブジェクト検出のスケールアップ Every Dataset Counts: Scaling up Monocular 3D Object Detection with Joint Datasets Training ( http://arxiv.org/abs/2310.00920v4 ) ライセンス: Link先を確認	Fulong Ma, Xiaoyang Yan, Guoyang Zhao, Xiaojie Xu, Yuxuan Liu, Jun Ma, Ming Liu,	(参考訳) モノクロ3D物体検出は、自律運転において重要な役割を果たす。しかし、既存のモノクル3D検出アルゴリズムは、LiDAR測定から派生した3Dラベルに依存している。具体的には,多種多様な3次元および2次元データセットを用いたモノクロ3次元物体検出モデルの学習パイプラインについて検討した。提案フレームワークは,(1)様々なカメラ設定にまたがって機能するロバストなモノクル3Dモデル,(2)異なるクラスアノテーションでデータセットを適応するための選択学習戦略,(3)2Dラベルを用いた擬似3Dトレーニング手法により,2Dラベルのみを含むシーンにおける検出性能を向上させる。このフレームワークにより、様々なオープンな3D/2Dデータセットのジョイントセット上でモデルをトレーニングし、より強力な一般化能力を持つモデルと、2Dラベルのみを持つ新しいデータセットの性能を向上させることができる。我々はKITTI/nuScenes/ONCE/Cityscapes/BDD100Kデータセットに関する広範な実験を行い、提案手法のスケーリング能力を実証した。 Monocular 3D object detection plays a crucial role in autonomous driving. However, existing monocular 3D detection algorithms depend on 3D labels derived from LiDAR measurements, which are costly to acquire for new datasets and challenging to deploy in novel environments. To address this, this study investigates the pipeline for training a monocular 3D object detection model on a diverse collection of 3D and 2D datasets. The proposed framework comprises three components: (1) a robust monocular 3D model capable of functioning across various camera settings, (2) a selective-training strategy to accommodate datasets with differing class annotations, and (3) a pseudo 3D training approach using 2D labels to enhance detection performance in scenes containing only 2D labels. With this framework, we can train models on a joint set of various open 3D/2D datasets to obtain models with significantly stronger generalization capability and enhanced performance on new datasets with only 2D labels. We conduct extensive experiments on KITTI/nuScenes/ONCE/Cityscapes/BDD100K datasets to demonstrate the scaling ability of the proposed method.	公開日:2024-09-24 翻訳日:2024-11-09 10:12:15
# 誘引子ダイナミクスによる離散的、構成的、象徴的表現 Discrete, compositional, and symbolic representations through attractor dynamics ( http://arxiv.org/abs/2310.01807v2 ) ライセンス: Link先を確認	Andrew Nam, Eric Elmoznino, Nikolay Malkin, James McClelland, Yoshua Bengio, Guillaume Lajoie,	(参考訳) シンボリックシステムは、人間の推論と行動の多くの側面に根ざしたルールと関係をカプセル化するので、認知過程をモデル化するための強力なフレームワークである。これらのモデルの中心は、体系性、構成性、生産性であり、認知科学と人工知能の両方において貴重である。しかし、いくつかの制限が残っている。例えば、構造化された記号過程と潜在サブシンボル過程の統合は、量子化やソフトマックスサンプリングのようなフィアット手法によって計算レベルで実装されている。そこで本研究では,思考の確率的言語(PLoT)に似た認知過程をモデル化するために,アトラクタダイナミクスを記号表現と統合した新しいニューラル確率力学系モデルを提案する。我々のモデルは、連続表現空間を、事前定義されたプリミティブに頼るのではなく、教師なし学習を通じて、記号系の意味性と構成性の特徴を反映する、記号列に対応する引き付け状態を持つ離散盆地に分割する。さらに、PLoTと同様に、入力データとシンボルエンコーディングの相互情報を反映したアトラクタ状態の多種多様な分布のサンプルを学習する。このアプローチは、認知操作の複雑な双対性を反映したより包括的なモデルを提供する、AIで表現力の証明された神経弁別可能な基質であるニューラルダイナミクスを通じて、シンボル処理とサブシンボル処理の両方を統合する統一的なフレームワークを確立する。 Symbolic systems are powerful frameworks for modeling cognitive processes as they encapsulate the rules and relationships fundamental to many aspects of human reasoning and behavior. Central to these models are systematicity, compositionality, and productivity, making them invaluable in both cognitive science and artificial intelligence. However, certain limitations remain. For instance, the integration of structured symbolic processes and latent sub-symbolic processes has been implemented at the computational level through fiat methods such as quantization or softmax sampling, which assume, rather than derive, the operations underpinning discretization and symbolicization. In this work, we introduce a novel neural stochastic dynamical systems model that integrates attractor dynamics with symbolic representations to model cognitive processes akin to the probabilistic language of thought (PLoT). 
Our model segments the continuous representational space into discrete basins, with attractor states corresponding to symbolic sequences, that reflect the semanticity and compositionality characteristic of symbolic systems through unsupervised learning, rather than relying on pre-defined primitives. Moreover, like PLoT, our model learns to sample a diverse distribution of attractor states that reflect the mutual information between the input data and the symbolic encodings. This approach establishes a unified framework that integrates both symbolic and sub-symbolic processing through neural dynamics, a neuro-plausible substrate with proven expressivity in AI, offering a more comprehensive model that mirrors the complex duality of cognitive operations.	公開日:2024-09-26 翻訳日:2024-11-09 10:12:15
# 会話型健康エージェント:パーソナライズされたLLM駆動エージェントフレームワーク Conversational Health Agents: A Personalized LLM-Powered Agent Framework ( http://arxiv.org/abs/2310.02374v5 ) ライセンス: Link先を確認	Mahyar Abbasian, Iman Azimi, Amir M. Rahmani, Ramesh Jain,	(参考訳) 会話型健康エージェント(Conversational Health Agents、CHA)は、支援や診断などの医療サービスを提供する対話型システムである。現在のCHA、特に大規模言語モデル(LLM)を利用するものは、主に会話の側面に焦点を当てている。しかし、エージェントとしての機能は限られており、特に多段階の問題解決、パーソナライズされた会話、マルチモーダルデータ分析を欠いている。我々の目標はこれらの制限を克服することである。我々は,会話エージェントがユーザの医療に関する問い合わせに対してパーソナライズされた応答を生成できるようにする、オープンソースのLLM駆動フレームワークopenCHAを提案する。このフレームワークにより、開発者はデータソース、知識ベース、分析モデルを含む外部ソースをLLMベースのソリューションに統合できる。 openCHAには、ユーザの問い合わせへの応答の作成に不可欠な、外部ソースから情報を収集するためのアクションを計画し実行するオーケストレータが含まれている。知識獲得、問題解決能力、多言語かつマルチモーダルな会話を促進し、さまざまなAIプラットフォームとの連携を促す。 2つのデモと4つのユースケースを通じて、複雑なヘルスケアタスクを扱うフレームワークの能力を示す。さらに、openCHAをオープンソースとしてGitHubを通じてコミュニティに公開している。 Conversational Health Agents (CHAs) are interactive systems that provide healthcare services, such as assistance and diagnosis. Current CHAs, especially those utilizing Large Language Models (LLMs), primarily focus on conversation aspects. However, they offer limited agent capabilities, specifically lacking multi-step problem-solving, personalized conversations, and multimodal data analysis. Our aim is to overcome these limitations. We propose openCHA, an open-source LLM-powered framework, to empower conversational agents to generate a personalized response for users' healthcare queries. This framework enables developers to integrate external sources including data sources, knowledge bases, and analysis models, into their LLM-based solutions. openCHA includes an orchestrator to plan and execute actions for gathering information from external sources, essential for formulating responses to user inquiries. It facilitates knowledge acquisition, problem-solving capabilities, multilingual and multimodal conversations, and fosters interaction with various AI platforms. 
We illustrate the framework's proficiency in handling complex healthcare tasks via two demonstrations and four use cases. Moreover, we release openCHA as open source available to the community via GitHub.	公開日:2024-09-25 翻訳日:2024-11-09 10:12:15
# 原子アンサンブルにおける同時スピンスクイーズと光スクイーズ Concurrent spin squeezing and light squeezing in an atomic ensemble ( http://arxiv.org/abs/2310.02493v2 ) ライセンス: Link先を確認	Shenchao Jin, Junlei Duan, Youwei Zhang, Xichang Zhang, Han Bao, Heng Shen, Liantuan Xiao, Suotang Jia, Mingfeng Wang, Yanhong Xiao,	(参考訳) スクイーズドスピン状態とスクイーズド光はいずれも量子計測と量子情報科学の鍵となる資源であるが、これまでの実験では別々に研究されてきた。この2種類の量子状態を1つの実験装置で同時に生成することは興味深いが、依然として挑戦的な目標である。本稿では, 巧みに設計された対称的な原子-光相互作用に基づく新しいプロトコルを提案し, 高温原子アンサンブルにおいて $0.61\pm0.09~\mathrm{dB}$ のスピンスクイーズと $0.65^{+0.11}_{-0.10}~\mathrm{dB}$ の光スクイーズを同時に実現した原理実証実験の結果を報告する。スクイーズ過程は決定論的であり、光場と集団原子スピンの両方に対して固定されたスクイーズ方向を与える。さらに、スクイーズド光モードは単一の空間モードの複数の周波数サイドバンドに存在する。この新しいタイプの二重スクイーズド状態は、量子増強計測と量子ネットワークに適用できる。我々の方法は、オプトメカニクス、冷却原子、捕捉イオンなどの他の量子プラットフォームに拡張できる。 Squeezed spin states and squeezed light are both key resources for quantum metrology and quantum information science, but have been separately investigated in experiments so far. Simultaneous generation of these two types of quantum states in one experiment setup is intriguing but remains a challenging goal. Here we propose a novel protocol based on judiciously engineered symmetric atom-light interaction, and report proof-of-principle experimental results of concurrent spin squeezing of $0.61\pm0.09~\mathrm{dB}$ and light squeezing of $0.65^{+0.11}_{-0.10}~\mathrm{dB}$ in a hot atomic ensemble. The squeezing process is deterministic, yielding fixed squeezing directions for both the light field and the collective atomic spin. Furthermore, the squeezed light modes lie in the multiple frequency sidebands of a single spatial mode. This new type of dual squeezed state is applicable for quantum enhanced metrology and quantum networks. Our method can be extended to other quantum platforms such as optomechanics, cold atom and trapped ions.	公開日:2024-09-24 翻訳日:2024-11-09 10:12:15
# 物理インフォームドニューラルネットワークを用いた多相流中遠心ポンプの学習特性パラメータとダイナミクス Learning characteristic parameters and dynamics of centrifugal pumps under multiphase flow using physics-informed neural networks ( http://arxiv.org/abs/2310.03001v2 ) ライセンス: Link先を確認	Felipe de Castro Teixeira Carvalho, Kamaljyoti Nath, Alberto Luiz Serpa, George Em Karniadakis,	(参考訳) 電気式潜水ポンプ(ESP)は、石油・ガス産業において人工揚力システムとして広く利用されている。これらのポンプは、炭化水素、水、堆積物の複雑な混合物からなる多相流に頻繁に遭遇する。このような混合物はエマルションの形成につながり、個々の相とは異なる有効粘性によって特徴づけられる。これらの条件を評価するために使用される従来の多相流量計は、高い運用コストと劣化に対する感受性によって負担される。そこで本研究では,ESPシステムの流体特性,動的状態,重要なパラメータを間接的に推定する物理インフォームドニューラルネットワーク(PINN)モデルを提案する。ポンプからの吸気・吐出圧力測定を用いて, 確実に推定できるパラメータのサブセットについて, 包括的構造的, 実用的識別可能性分析を行った。 PINNモデルの有効性は,これらの圧力測定を入力データとして,未知の状態とパラメータを推定することによって検証した。さらに, 各種含水シナリオのシミュレーションデータと実験データを用いて, PINNモデルの性能を粒子フィルタ法と比較した。比較分析の結果, PINNモデルは従来の多相流速計の代替として有望な可能性を秘めており, 運用効率の向上とESPアプリケーションのコスト削減に期待できる道筋となっている。 Electrical submersible pumps (ESPs) are prevalently utilized as artificial lift systems in the oil and gas industry. These pumps frequently encounter multiphase flows comprising a complex mixture of hydrocarbons, water, and sediments. Such mixtures lead to the formation of emulsions, characterized by an effective viscosity distinct from that of the individual phases. Traditional multiphase flow meters, employed to assess these conditions, are burdened by high operational costs and susceptibility to degradation. To this end, this study introduces a physics-informed neural network (PINN) model designed to indirectly estimate the fluid properties, dynamic states, and crucial parameters of an ESP system. A comprehensive structural and practical identifiability analysis was performed to delineate the subset of parameters that can be reliably estimated through the use of intake and discharge pressure measurements from the pump. The efficacy of the PINN model was validated by estimating the unknown states and parameters using these pressure measurements as input data. 
Furthermore, the performance of the PINN model was benchmarked against the particle filter method utilizing both simulated and experimental data across varying water content scenarios. The comparative analysis suggests that the PINN model holds significant potential as a viable alternative to conventional multiphase flow meters, offering a promising avenue for enhancing operational efficiency and reducing costs in ESP applications.	公開日:2024-09-23 翻訳日:2024-11-09 10:12:15
# 構造対応レコメンデーションインベディング進化のためのグラフ付最適化器 Graph-enhanced Optimizers for Structure-aware Recommendation Embedding Evolution ( http://arxiv.org/abs/2310.03032v3 ) ライセンス: Link先を確認	Cong Xu, Jun Wang, Jianyong Wang, Wei Zhang,	(参考訳) 埋め込みは、現実世界の実体の仮想表現であり、その後の意思決定モデルの基礎であるため、現代のレコメンデーションシステムにおいて重要な役割を果たす。本稿では,新しい組込み更新機構であるSEvo(Structure-aware Embedding Evolution)を提案する。通常、中間モジュールとして機能するGNN(Graph Neural Network)とは異なり、SEvoはトレーニング中に最小の計算オーバーヘッドでグラフ構造情報を埋め込みに直接注入することができる。 SEvoの収束特性とその潜在的な変種は、設計の有効性を正当化するために理論的に解析される。さらに、SEvoは最先端のパフォーマンスのために既存のオプティマイザにシームレスに統合できる。特に、モーメント推定補正を施したSevo強化AdamWは、モデルとデータセットの範囲で一貫した改善を示し、明示的なGNNモジュールを超えてグラフ構造情報を効果的に活用する新たな技術経路を示唆している。 Embedding plays a key role in modern recommender systems because they are virtual representations of real-world entities and the foundation for subsequent decision-making models. In this paper, we propose a novel embedding update mechanism, Structure-aware Embedding Evolution (SEvo for short), to encourage related nodes to evolve similarly at each step. Unlike GNN (Graph Neural Network) that typically serves as an intermediate module, SEvo is able to directly inject graph structural information into embedding with minimal computational overhead during training. The convergence properties of SEvo along with its potential variants are theoretically analyzed to justify the validity of the designs. Moreover, SEvo can be seamlessly integrated into existing optimizers for state-of-the-art performance. Particularly SEvo-enhanced AdamW with moment estimate correction demonstrates consistent improvements across a spectrum of models and datasets, suggesting a novel technical route to effectively utilize graph structural information beyond explicit GNN modules.	公開日:2024-09-27 翻訳日:2024-11-09 10:12:15
# 非滑らか弱凸有限和結合合成最適化 Non-Smooth Weakly-Convex Finite-sum Coupled Compositional Optimization ( http://arxiv.org/abs/2310.03234v5 ) ライセンス: Link先を確認	Quanqi Hu, Dixian Zhu, Tianbao Yang,	(参考訳) 本稿では,新しい合成最適化問題である$\underline{\bf n}$on-$\underline{\bf s}$mooth $\underline{\bf w}$eakly-$\underline{\bf c}$onvex $\underline{\bf f}$inite-sum $\underline{\bf c}$oupled $\underline{\bf c}$ompositional $\underline{\bf o}$ptimization (NSWC FCCO)について検討する。機械学習とAIの幅広い応用と、経験的リスク最小化に基づく確率的アルゴリズムの欠点に対処する能力により、FCCOへの関心が高まっている。しかし、FCCOの最近の研究は、内部関数と外部関数の両方が滑らかであり、より多様な問題に取り組む可能性を制限すると仮定している。本研究は, 外部関数が弱凸で非減少し, 内関数が弱凸である非滑らかなFCCOを調べることにより, この領域を拡大する。単ループアルゴリズムを解析し、目的関数のモロー展開の$\epsilon$-stationary点を求める複雑性を確立する。さらに,3つの関数のネスト配置を特徴とする,非滑らかな弱凸三値有限サム結合合成最適化問題にもアルゴリズムを拡張した。最後に,2方向部分AUC最大化と多方向部分AUC最大化のためのディープラーニングへのアルゴリズムの適用について,実験的検討を用いて検討した。 This paper investigates new families of compositional optimization problems, called $\underline{\bf n}$on-$\underline{\bf s}$mooth $\underline{\bf w}$eakly-$\underline{\bf c}$onvex $\underline{\bf f}$inite-sum $\underline{\bf c}$oupled $\underline{\bf c}$ompositional $\underline{\bf o}$ptimization (NSWC FCCO). There has been a growing interest in FCCO due to its wide-ranging applications in machine learning and AI, as well as its ability to address the shortcomings of stochastic algorithms based on empirical risk minimization. However, current research on FCCO presumes that both the inner and outer functions are smooth, limiting their potential to tackle a more diverse set of problems. Our research expands on this area by examining non-smooth weakly-convex FCCO, where the outer function is weakly convex and non-decreasing, and the inner function is weakly-convex. We analyze a single-loop algorithm and establish its complexity for finding an $\epsilon$-stationary point of the Moreau envelop of the objective function. 
Additionally, we also extend the algorithm to solving novel non-smooth weakly-convex tri-level finite-sum coupled compositional optimization problems, which feature a nested arrangement of three functions. Lastly, we explore the applications of our algorithms in deep learning for two-way partial AUC maximization and multi-instance two-way partial AUC maximization, using empirical studies to showcase the effectiveness of the proposed algorithms.	公開日:2024-09-24 翻訳日:2024-11-09 10:12:15
# リレーショナル・コンボリューションによる階層的関係表現の学習 Learning Hierarchical Relational Representations through Relational Convolutions ( http://arxiv.org/abs/2310.03240v3 ) ライセンス: Link先を確認	Awni Altabaa, John Lafferty,	(参考訳) ディープラーニングの研究分野は、関係的特徴表現の学習を支援するアーキテクチャと帰納的バイアスの研究である。本稿では,階層的関係の表現を学習する上での課題,すなわちオブジェクト群間の高次関係パターンについて述べる。本稿では,単純なモジュールを構成することで,より複雑な関係性を段階的に捉える計算機構を備えたニューラルネットワークである「リレーショナル畳み込みネットワーク」を紹介する。このフレームワークの重要なコンポーネントは、グラフレットフィルタを結合することで、オブジェクトのグループ内のリレーショナルパターンをキャプチャする新しい操作である。関係的畳み込みを構成することは、高次の階層的関係の表現を学ぶ深いアーキテクチャをもたらす。アーキテクチャのモチベーションと詳細、およびリレーショナル畳み込みネットワークが階層構造を持つリレーショナルタスクをモデル化するための効果的なフレームワークを提供するための一連の実験を示す。 An evolving area of research in deep learning is the study of architectures and inductive biases that support the learning of relational feature representations. In this paper, we address the challenge of learning representations of hierarchical relations--that is, higher-order relational patterns among groups of objects. We introduce "relational convolutional networks", a neural architecture equipped with computational mechanisms that capture progressively more complex relational features through the composition of simple modules. A key component of this framework is a novel operation that captures relational patterns in groups of objects by convolving graphlet filters--learnable templates of relational patterns--against subsets of the input. Composing relational convolutions gives rise to a deep architecture that learns representations of higher-order, hierarchical relations. We present the motivation and details of the architecture, together with a set of experiments to demonstrate how relational convolutional networks can provide an effective framework for modeling relational tasks that have hierarchical structure.	公開日:2024-09-26 翻訳日:2024-11-09 10:12:15
# データ依存結合を持つ確率補間子 Stochastic interpolants with data-dependent couplings ( http://arxiv.org/abs/2310.03725v3 ) ライセンス: Link先を確認	Michael S. Albergo, Mark Goldstein, Nicholas M. Boffi, Rajesh Ranganath, Eric Vanden-Eijnden,	(参考訳) フローや拡散のような測度の動的輸送にインスパイアされた生成モデルは、2つの確率密度の間の連続時間マップを構築する。従来、これらのうちの1つはターゲット密度であり、サンプルを通してのみアクセス可能であり、もう1つはデータに依存しない単純な基底密度と見なされている。本研究では,確率的補間子の枠組みを用いて,ベースとターゲット密度の \textit{couple} を定式化する。そこで,ベースからのサンプルを,クラスラベルや連続埋め込みに関する情報を組み込んだ(ただし妨げない)方法で,ターゲットからのサンプルを条件付きで計算する。これにより、条件付き生成モデルとして機能する動的トランスポートマップを構築することができる。これらのトランスポートマップは、標準的な独立な設定に類似した単純な2乗損失回帰問題を解くことで学習可能であることを示す。超高分解能および in-painting の実験を通じて, 実際に依存結合を構築することの有用性を実証する。 Generative models inspired by dynamical transport of measure -- such as flows and diffusions -- construct a continuous-time map between two probability densities. Conventionally, one of these is the target density, only accessible through samples, while the other is taken as a simple base density that is data-agnostic. In this work, using the framework of stochastic interpolants, we formalize how to \textit{couple} the base and the target densities, whereby samples from the base are computed conditionally given samples from the target in a way that is different from (but does preclude) incorporating information about class labels or continuous embeddings. This enables us to construct dynamical transport maps that serve as conditional generative models. We show that these transport maps can be learned by solving a simple square loss regression problem analogous to the standard independent setting. We demonstrate the usefulness of constructing dependent couplings in practice through experiments in super-resolution and in-painting.	公開日:2024-09-23 翻訳日:2024-11-09 10:12:15

# 脳モデルとしての概念価値ネットワーク

A Concept-Value Network as a Brain Model ( http://arxiv.org/abs/1904.04579v6 )

ライセンス: Link先を確認

Kieran Greer,

(参考訳) 本稿では,脳様モデルの物理的実体と概念的実体の関係を記述するための統計的枠組みを提案する。特徴と概念インスタンスを文脈の中に位置づけ、化学的接続もあり得るものの、特徴は電気的配線である可能性を示唆する。この考え方では、実際の接続長は発火率やニューロンの同期に関係するため重要であるが、信号の種類はそれほど重要ではない。さらに、概念は特徴集合をリンクするニューロン群であり、概念インスタンスはそれらの群からの化学信号によって決定されることを示唆する。したがって、特徴は神経系の静的な水平フレームワークとなり、概念はそれらを垂直に相互接続した組み合わせとなる。機能性に関しては、ニューロンが機能的であると考えられ、より水平な記憶構造はグリアでさえあり得る。これはまた、特徴が分散した実体であり得て、単一の領域に集中していないことを示唆する。もう一つの側面は、パターンを区画化し、神経結合を助ける可能性のある信号の「ブレーク」である。

This paper suggests a statistical framework for describing the relations between the physical and conceptual entities of a brain-like model. Features and concept instances are put into context, where the paper suggests that features may be the electrical wiring, although chemical connections are also possible. With this idea, the actual length of the connection is important, because it is related to firing rates and neuron synchronization, but the signal type is less important. The paper then suggests that concepts are neuron groups that link feature sets and concept instances are determined by chemical signals from those groups. Therefore, features become the static horizontal framework of the neural system and concepts are vertically interconnected combinations of these. With regards to functionality, the neuron is then considered to be functional and the more horizontal memory structures can even be glial. This would also suggest that features can be distributed entities and not concentrated to a single area. Another aspect could be signal 'breaks' that compartmentalise a pattern and may help with neural binding.

公開日:2024-09-26
翻訳日:2024-11-09 16:01:17

# 確率的要求による車両経路問題のゲーミフィケーション

Gamifying the Vehicle Routing Problem with Stochastic Requests ( http://arxiv.org/abs/1911.05922v2 )

ライセンス: Link先を確認

Nicholas D. Kullman, Nikita Dudorov, Jorge E. Mendoza, Martin Cousineau, Justin C. Goodson,

(参考訳) 最初のビデオゲーム機を覚えていますか。私たちは自分たちのものを覚えています。数十年前、それらは何時間もの娯楽を提供してくれました。現在、私たちはそれらを動的・確率的最適化問題を解くために転用しています。深層強化学習手法が幅広いAtariゲームで超人的な性能を記録していることを踏まえ、古典的な物流問題をゲームとして表現する課題を考える。そして、エージェントを訓練してそのゲームをプレイさせる。確率的要求を伴う車両経路問題に対して、いくつかのゲーム設計を検討する。視点、視野、ミニマップなど、さまざまな設計上の特徴がエージェントの性能にどのように影響するかを示す。適切なゲーム設計のもとでは、汎用のAtariエージェントは、特に問題サイズが大きくなるにつれて、最適化ベースのベンチマークを上回る。我々の研究は、ゲームによる動的・確率的最適化問題の表現が有望な研究方向であることを示している。

Do you remember your first video game console? We remember ours. Decades ago, they provided hours of entertainment. Now, we have repurposed them to solve dynamic and stochastic optimization problems. With deep reinforcement learning methods posting superhuman performance on a wide range of Atari games, we consider the task of representing a classic logistics problem as a game. Then, we train agents to play it. We consider several game designs for the vehicle routing problem with stochastic requests. We show how various design features impact agents' performance, including perspective, field of view, and minimaps. With the right game design, general purpose Atari agents outperform optimization-based benchmarks, especially as problem size grows. Our work points to the representation of dynamic and stochastic optimization problems via games as a promising research direction.

公開日:2024-09-23
翻訳日:2024-11-09 16:01:17

# 神経疾患における脳ネットワーク機能障害の発見のためのマルチレゾリューショングラフエッジ埋め込みの学習

Learning Multi-resolution Graph Edge Embedding for Discovering Brain Network Dysfunction in Neurological Disorders ( http://arxiv.org/abs/1912.01181v1 )

ライセンス: Link先を確認

Xin Ma, Guorong Wu, Seong Jae Hwang, Won Hwa Kim

(参考訳) 近年の膨大な文献は、異なる脳領域間の関連、すなわち脳の接続性が神経疾患の早期症状を示すことを明らかにしている。グラフニューラルネットワーク(GNN)技術には多大な努力が払われてきたが、グラフノードに重点を置いているため、現在の最先端のGNN手法は、グラフリンク上の疾患関連ネットワーク機能障害パターンを特徴付けることを目的として脳接続性をグラフとして分類する用途には適さない。この問題に対処するために,診断カテゴリ間で高い識別能力を持つ疾患特異的なコネクトーム指標を検出するマルチレゾリューションエッジネットワーク(MENET)を提案する。 MENETの中核は、我々が提案する新しいグラフエッジ単位の変換であり、マルチレゾリューションの ``connectomic'' 特徴を捉えることができる。豊富なコネクトーム特徴の集合を用いて、識別的なエッジの選択とグラフへの診断ラベルの割り当てを同時に行うグラフ学習フレームワークを考案する。 2つの実データセットでの実験により、MENETは診断ラベルを正確に予測し、アルツハイマー病や注意欠陥・多動性障害などの神経疾患と強く関連する脳の接続性を特定することが示された。

Tremendous recent literature show that associations between different brain regions, i.e., brain connectivity, provide early symptoms of neurological disorders. Despite significant efforts made for graph neural network (GNN) techniques, their focus on graph nodes makes the state-of-the-art GNN methods not suitable for classifying brain connectivity as graphs where the objective is to characterize disease-relevant network dysfunction patterns on graph links. To address this issue, we propose Multi-resolution Edge Network (MENET) to detect disease-specific connectomic benchmarks with high discrimination power across diagnostic categories. The core of MENET is a novel graph edge-wise transform that we propose, which allows us to capture multi-resolution ``connectomic'' features. Using a rich set of the connectomic features, we devise a graph learning framework to jointly select discriminative edges and assign diagnostic labels for graphs. Experiments on two real datasets show that MENET accurately predicts diagnostic labels and identify brain connectivities highly associated with neurological disorders such as Alzheimer's Disease and Attention-Deficit/Hyperactivity Disorder.

公開日:2024-09-26
翻訳日:2024-11-09 16:01:17

# 神経障害における脳ネットワーク障害発見のための多分解能グラフエッジ埋め込みの学習

Learning Multi-resolution Graph Edge Embedding for Discovering Brain Network Dysfunction in Neurological Disorders ( http://arxiv.org/abs/1912.01181v2 )

ライセンス: Link先を確認

Xin Ma, Guorong Wu, Seong Jae Hwang, Won Hwa Kim,

公開日:2024-09-26
翻訳日:2024-11-09 15:57:56

# 教師なしの学習表現:クエストは終わりか?

Unsupervisedly Learned Representations: Should the Quest be Over? ( http://arxiv.org/abs/2001.07495v1 )

ライセンス: Link先を確認

Daniel N. Nissani (Nissensohn)

(参考訳) 40年にわたる研究を経ても、最良の教師なし学習表現手法と知的な動物が達成する精度との間には、およそ20%の分類精度のギャップが依然として存在する。したがって、我々は間違った方向を見ているのかもしれない。このパズルに対する一つの解決策を提示する。強化学習が動物と同等の精度を達成する表現を学習できることを実証する。我々のささやかな主な貢献は、以下の観察にある。 a) 実環境に適用する場合、強化学習はラベルを必要としないため、正当に教師なし学習とみなすことができる。 b) 対照的に、シミュレーション環境で適用する場合、強化学習は本質的にラベルを必要とするため、一般には教師あり学習とみなすべきである。これらの観察の帰結として、シミュレーション環境で訓練可能な、競争力のある教師なし学習パラダイムをさらに探索することは無駄になる可能性がある。

After four decades of research there still exists a Classification accuracy gap of about 20% between our best Unsupervisedly Learned Representations methods and the accuracy rates achieved by intelligent animals. It thus may well be that we are looking in the wrong direction. A possible solution to this puzzle is presented. We demonstrate that Reinforcement Learning can learn representations which achieve the same accuracy as that of animals. Our main modest contribution lies in the observations that: a. when applied to a real world environment Reinforcement Learning does not require labels, and thus may be legitimately considered as Unsupervised Learning, and b. in contrast, when Reinforcement Learning is applied in a simulated environment it does inherently require labels and should thus be generally be considered as Supervised Learning. The corollary of these observations is that further search for Unsupervised Learning competitive paradigms which may be trained in simulated environments may be futile.

公開日:2024-09-26
翻訳日:2024-11-09 15:57:56

# 教師なしの学習表現:クエストは終わりか?

Unsupervisedly Learned Representations: Should the Quest be Over? ( http://arxiv.org/abs/2001.07495v4 )

ライセンス: Link先を確認

Daniel N. Nissani,

公開日:2024-09-26
翻訳日:2024-11-09 15:57:56

# 教師なしの学習表現:クエストは終わりか?

Unsupervisedly Learned Representations: Should the Quest be Over? ( http://arxiv.org/abs/2001.07495v5 )

ライセンス: Link先を確認

Daniel N. Nissani,

公開日:2024-09-26
翻訳日:2024-11-09 15:57:56

# 代数的暗号解析における形式的冪級数

Formal Power Series on Algebraic Cryptanalysis ( http://arxiv.org/abs/2007.14729v3 )

ライセンス: Link先を確認

Shuhei Nakamura,

(参考訳) 暗号系への攻撃を多項式方程式系の求解に帰着させる際の計算量評価において、暗号解析では正則性次数(degree of regularity)と first fall degree の上界がしばしば用いられる。正則性次数は半正則性の仮定の下で一変数の形式的冪級数を用いて容易に計算できるが、first fall degree の上界を決定するには、入力系の具体的なシジジーを調べる必要がある。本稿では,十分大きな体上の多項式系に対する first fall degree の上界を調べる。この場合、非半正則系の first fall degree は正則性次数で上から抑えられ、多重次数付き多項式系の first fall degree は多変数の形式的冪級数から定まるある値で上から抑えられることを証明する。さらに、十分大きな体上の多項式系の first fall degree を計算するための理論的な仮定を与える。

In the complexity estimation for an attack that reduces a cryptosystem to solving a system of polynomial equations, the degree of regularity and an upper bound of the first fall degree are often used in cryptanalysis. While the degree of regularity can be easily computed using a univariate formal power series under the semi-regularity assumption, determining an upper bound of the first fall degree requires investigating the concrete syzygies of an input system. In this paper, we investigate an upper bound of the first fall degree for a polynomial system over a sufficiently large field. In this case, we prove that the first fall degree of a non-semi-regular system is bounded above by the degree of regularity, and that the first fall degree of a multi-graded polynomial system is bounded above by a certain value determined from a multivariate formal power series. Moreover, we provide a theoretical assumption for computing the first fall degree of a polynomial system over a sufficiently large field.

公開日:2024-09-20
翻訳日:2024-11-09 15:57:56
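
The abstract above notes that the degree of regularity can be read off a univariate formal power series under the semi-regularity assumption. A minimal sketch of that computation follows. It assumes the commonly used definition over a large field: for equations of degrees d_1, ..., d_m in n variables, take the series prod_i (1 - t^{d_i}) / (1 - t)^n and return the index of its first non-positive coefficient. The function name `degree_of_regularity` and the truncation parameter `max_deg` are illustrative choices, not taken from the paper.

```python
def degree_of_regularity(n_vars, degrees, max_deg=200):
    """Index of the first non-positive coefficient of
    prod_i (1 - t^{d_i}) / (1 - t)^{n_vars}, truncated at max_deg.

    Assumes a semi-regular system over a sufficiently large field.
    Returns None if no such coefficient appears up to max_deg.
    """
    coeffs = [0] * (max_deg + 1)
    coeffs[0] = 1
    # Multiply by (1 - t^d) for each equation degree d.
    # Iterating k downwards keeps coeffs[k - d] at its old value.
    for d in degrees:
        for k in range(max_deg, d - 1, -1):
            coeffs[k] -= coeffs[k - d]
    # Multiply by 1 / (1 - t) once per variable (a prefix sum).
    for _ in range(n_vars):
        for k in range(1, max_deg + 1):
            coeffs[k] += coeffs[k - 1]
    for k, c in enumerate(coeffs):
        if c <= 0:
            return k
    return None


# Four quadrics in four variables: the series is (1 + t)^4,
# so the first non-positive coefficient is at index 5.
print(degree_of_regularity(4, [2, 2, 2, 2]))      # 5
# Five quadrics in four variables: (1 + t)^4 (1 - t^2) = 1 + 4t + 5t^2 + 0t^3 + ...
print(degree_of_regularity(4, [2, 2, 2, 2, 2]))   # 3
```

The first fall degree discussed in the abstract is a different quantity and is not computed by this sketch.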

# ゼロ知識ゲーム

Zero Knowledge Games ( http://arxiv.org/abs/2009.13521v7 )

ライセンス: Link先を確認

Ian Malloy,

(参考訳) 本稿では,不完全記憶と不完備情報のもとで、全ての戦略が非開示的(non-revealing)であるようなゲームをモデル化する。また,線形変換として修正スライディングブロック符号を導入し,公開告知のもとでプレイヤーがどの程度情報を持っているかについての共通知識を生成する。最終的に、2人のプレイヤーまたは2つの提携の間で、両プレイヤーが情報を持つゼロ知識ゲームは、混合戦略ナッシュ均衡において確立される信頼という効用を持つことがわかる。ゼロ知識ゲームは信頼と健全性のゲームであり、情報を持つことに効用を置く。情報を持たない可能性のあるプレイヤーは、自らが情報を持たないことを明らかにしてしまう。「検証の意思」が損なわれると、主張者が繰り返される虚偽の主張や情報を持たないことについて責任を問われなくなることがある。

In this paper we model a game such that all strategies are non-revealing, with imperfect recall and incomplete information. We also introduce a modified sliding-block code as a linear transformation which generates common knowledge of how informed a player is under public announcements. Ultimately, we see that between two players or two coalitions; zero-knowledge games where both players are informed have the utility of trust established in the mixed strategy Nash equilibrium. A zero-knowledge game is one of trust and soundness, placing utility in being informed. For any player who may be uninformed, such players reveal they are uninformed. The "will to verify" may be eroded such that the claimant is never held responsible for their repeated false claims or being uninformed.

公開日:2024-09-22
翻訳日:2024-11-09 15:57:56

# 無線360度ビデオストリーミングのためのクロス層最適化と分散強化学習

Cross Layer Optimization and Distributed Reinforcement Learning for Wireless 360° Video Streaming ( http://arxiv.org/abs/2011.06356v3 )

ライセンス: Link先を確認

Anis Elgabli, Mohammed S. Elbamby, Cristina Perfecto, Mounssif Krouka, Mehdi Bennis, Vaneet Aggarwal,

(参考訳) 高品質な360度ビデオのワイヤレスストリーミングは、依然として難しい問題である。多数のユーザが異なる360度ビデオを視聴し、計算資源と通信資源を奪い合う場合、ストリーミングアルゴリズムは、各ユーザに最小レートを保証しつつ、平均体感品質(QoE)を最大化すべきである。本稿では,各ユーザが利用可能なレートを最大化し,それを効率的に用いてユーザのQoEを最大化するクロスレイヤ最適化手法を提案する。特にタイルベースの360度ビデオストリーミングを対象とし、各ユーザのQoEの最大化とユーザ間の公平性の確保とのトレードオフをバランスさせるQoEメトリックを最適化する。この問題は相互に関連する2つの部分問題に分離できることを示す。(i) 各ユーザのダウンロードレートを求めることを目的とする物理層の部分問題、(ii) そのレートを用いて、ユーザのQoEが最大になるようにタイルごとの品質を決定することを目的とするアプリケーション層の部分問題である。物理層の部分問題は低い計算量で最適に解けることを証明し、アプリケーション層の部分問題を解くために、複数の独立したエージェントの並列訓練を活用するアクター・クリティック型の深層強化学習(DRL)を提案する。広範な実験により,提案手法の頑健さが明らかになり,いくつかのベースラインアルゴリズムと比較して大幅な性能向上が示された。

Wirelessly streaming high quality 360 degree videos is still a challenging problem. When there are many users watching different 360 degree videos and competing for the computing and communication resources, the streaming algorithm at hand should maximize the average quality of experience (QoE) while guaranteeing a minimum rate for each user. In this paper, we propose a cross layer optimization approach that maximizes the available rate to each user and efficiently uses it to maximize users' QoE. Particularly, we consider a tile based 360 degree video streaming, and we optimize a QoE metric that balances the tradeoff between maximizing each user's QoE and ensuring fairness among users. We show that the problem can be decoupled into two interrelated subproblems: (i) a physical layer subproblem whose objective is to find the download rate for each user, and (ii) an application layer subproblem whose objective is to use that rate to find a quality decision per tile such that the user's QoE is maximized. We prove that the physical layer subproblem can be solved optimally with low complexity and an actor-critic deep reinforcement learning (DRL) is proposed to leverage the parallel training of multiple independent agents and solve the application layer subproblem. Extensive experiments reveal the robustness of our scheme and demonstrate its significant performance improvement compared to several baseline algorithms.

公開日:2024-09-24
翻訳日:2024-11-09 15:57:56

# チェッカーボード反強磁性体のほぼ縮退した基底状態とそのボソン的解釈

Nearly degenerate ground states of a checkerboard antiferromagnet and their bosonic interpretation ( http://arxiv.org/abs/2011.06520v2 )

ライセンス: Link先を確認

Haiyuan Zou, Fan Yang, Wei Ku,

(参考訳) $J_1$-$J_2$チェッカーボード格子上の反強磁性(AF)結合を持つスピン-$1/2$模型は、平面パイロクロア模型として知られ、強くフラストレートしており、2次元から1次元へのクロスオーバーと関連している。 Projected Entangled Simplex States テンソルネットワーク ansatz を用いて、フラストレート領域 ($J_1<J_2$) において多数のほぼ縮退した状態を同定する。具体的には、長年探し求められてきた交差ダイマー型の価結合固体(VBS)状態が $J_1\lesssim J_2$ における基底状態であり、残りの領域ではさまざまな1次元AF相関状態が取って代わることを見出す。ネマティック摂動に対するVBS状態の安定性を検証する。対応するボソン描像は低エネルギー物理の直感的な理解を与える。特に、この描像は容易面極限でVBS状態が弱まることを予測し,これを数値的に確認する。本研究は, この興味深い系の最も本質的な基底状態の性質を明らかにし, フラストレート磁性を扱ううえでのボソン描像の有用性を実証するものである。

The spin-$1/2$ model system with antiferromagnetic (AF) couplings on a $J_1$-$J_2$ checkerboard lattice, known as the planar pyrochlore model, is strongly frustrated and associated with a two-to-one dimensional crossover. Using the Projected Entangled Simplex States tensor network ansatz, we identify a large number of nearly degenerate states in the frustrated region ($J_1<J_2$). Specifically, we find the long-sought crossed-dimer valence bond solid (VBS) state to be the ground state at $J_1\lesssim J_2$, while various 1D AF correlated states take over the rest. We verify the stability of the VBS state against nematic perturbation. The corresponding bosonic picture provides an intuitive understanding of the low-energy physics. Particularly, it predicts weaker VBS states in the easy-plane limit, which we confirm numerically. Our results clarify the most essential ground state properties of this interesting system and demonstrate the usefulness of bosonic picture in dealing with frustrated magnetism.

公開日:2024-09-24
翻訳日:2024-11-09 15:57:56

# 高次元データに関する講義ノート

Lecture notes on high-dimensional data ( http://arxiv.org/abs/2101.05841v7 )

ライセンス: Link先を確認

Sven-Ake Wegner,

(参考訳) 本稿は、2019年から2020年にかけて英国でBSc最終学年の学生に教えた「数理データサイエンス」の講座の前半に基づく講義ノートである。トピックは、高次元における測度の集中、高次元におけるガウス確率ベクトル、ランダム射影、ガウスデータの分離(disentangling)である。改訂版が教科書 [Mathematical Introduction to Data Science, Springer, Berlin, Heidelberg, 2024, https://link.springer.com/book/10.1007/978-3-662-69426-8] の一部として出版されている。

These are lecture notes based on the first part of a course on 'Mathematical Data Science', which I taught to final year BSc students in the UK in 2019-2020. Topics include: concentration of measure in high dimensions; Gaussian random vectors in high dimensions; random projections; separation/disentangling of Gaussian data. A revised version has been published as part of the textbook [Mathematical Introduction to Data Science, Springer, Berlin, Heidelberg, 2024, https://link.springer.com/book/10.1007/978-3-662-69426-8].

公開日:2024-09-20
翻訳日:2024-11-09 15:57:56
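
The first topic listed in the abstract above, concentration of measure for Gaussian random vectors, can be illustrated numerically. The sketch below is not taken from the lecture notes: the dimensions, sample count, seed, and helper names (`gaussian_norms`, `relative_spread`) are assumptions chosen for illustration. It draws standard Gaussian vectors and compares how tightly their Euclidean norms cluster in low and in high dimension.

```python
import math
import random


def gaussian_norms(dim, n_samples, seed=0):
    """Euclidean norms of n_samples standard Gaussian vectors in R^dim."""
    rng = random.Random(seed)
    norms = []
    for _ in range(n_samples):
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norms.append(math.sqrt(sum(c * c for c in v)))
    return norms


def relative_spread(norms):
    """Return (mean, standard deviation divided by mean) of the norms."""
    mean = sum(norms) / len(norms)
    var = sum((r - mean) ** 2 for r in norms) / len(norms)
    return mean, math.sqrt(var) / mean


mean_low, spread_low = relative_spread(gaussian_norms(2, 200))
mean_high, spread_high = relative_spread(gaussian_norms(1000, 200))

# In high dimension the norms sit close to sqrt(dim) with a small relative spread.
print("dim=2:    mean norm %.3f, relative spread %.3f" % (mean_low, spread_low))
print("dim=1000: mean norm %.3f, relative spread %.3f" % (mean_high, spread_high))
print("sqrt(1000) = %.3f" % math.sqrt(1000))
```

With these settings the relative spread in dimension 1000 should come out far smaller than in dimension 2, which is the qualitative effect the notes refer to.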

# 加速法

Acceleration Methods ( http://arxiv.org/abs/2101.09545v1 )

ライセンス: Link先を確認

Alexandre d'Aspremont, Damien Scieur and Adrien Taylor

(参考訳) このモノグラフは、凸最適化で頻繁に用いられる一連の加速手法に関する最近の進展を扱う。まず、2次最適化問題を用いて、モーメンタム法と入れ子型最適化スキームという2つの主要な手法群を導入する。これらは2次の場合に一致し、チェビシェフ法となる。モーメンタム法については、Nesterovの先駆的な研究から始めて詳しく論じ、最適化勾配法(optimized gradient method)に対するものなど、いくつかのマスターテンプレートを用いて収束証明を構成する。これらのテンプレートには、モーメンタム法が収束保証をどのように最適化しているかを示せるという重要な利点がある。さらに、CatalystおよびAccelerated Hybrid Proximal Extragradientフレームワークの中核にある近接加速を、同様のアルゴリズムパターンを用いて扱う。一般的な加速手法は、対象とする問題の正則性パラメータの一部が既知であることに直接依存する。最後に、未観測の正則性パラメータに適応しつつ、ほぼ最適な収束率に到達するための一連の簡単な手法であるリスタートスキームを議論する。

This monograph covers some recent advances in a range of acceleration techniques frequently used in convex optimization. We first use quadratic optimization problems to introduce two key families of methods, namely momentum and nested optimization schemes. They coincide in the quadratic case to form the Chebyshev method. We discuss momentum methods in detail, starting with the seminal work of Nesterov and structure convergence proofs using a few master templates, such as that for optimized gradient methods, which provide the key benefit of showing how momentum methods optimize convergence guarantees. We further cover proximal acceleration, at the heart of the Catalyst and Accelerated Hybrid Proximal Extragradient frameworks, using similar algorithmic patterns. Common acceleration techniques rely directly on the knowledge of some of the regularity parameters in the problem at hand. We conclude by discussing restart schemes, a set of simple techniques for reaching nearly optimal convergence rates while adapting to unobserved regularity parameters.

Published: 2024-09-24
Translated: 2024-11-09 15:57:56

# Acceleration Methods

Acceleration Methods ( http://arxiv.org/abs/2101.09545v4 )

License: see the linked page

Alexandre d'Aspremont, Damien Scieur, Adrien Taylor

Published: 2024-09-24
Translated: 2024-11-09 15:57:56

# Gapless Spin Liquid and Non-local Corner Excitation in the Spin-1/2 Heisenberg Antiferromagnet on a Fractal

Gapless Spin Liquid and Non-local Corner Excitation in the Spin-1/2 Heisenberg Antiferromagnet on Fractal ( http://arxiv.org/abs/2105.12487v2 )

License: see the linked page

Haiyuan Zou, Wei Wang

(参考訳) フラクタル系の数学的美しさと最近の実験的実現により、スピン-$1/2$反強磁性ハイゼンベルク模型をSierpińskiガスケット上で研究した。フラクタル多孔質の特徴は、エキゾチックな量子状態を示す新しい種類のフラストレーションを生み出す。先進テンソルネットワーク技術を用いて,分数空間次元における量子ギャップレス-スピン-液体基底状態の同定を行う。このフラクタルスピン系は非自明な非局所的性質も示している。超短距離相関は、非常に縮退したスピン形状因子を引き起こすが、このフラクタル系の絡み合いは長距離スケーリングの挙動を示唆している。また, 動的構造因子について検討し, 基底状態の絡み合いから生じる安定なコーナー励起によるギャップレス励起を明らかにした。我々の結果は、このフラクタルスピンシステムの複数の重要な性質を明確に指摘し、スピン液体とフラストレーション磁石を探索する新たな経路を開く。

Motivated by the mathematical beauty and the recent experimental realizations of fractal systems, we study the spin-$1/2$ antiferromagnetic Heisenberg model on a Sierpiński gasket. The fractal porous feature generates new kinds of frustration to exhibit exotic quantum states. Using advanced tensor network techniques, we identify a quantum gapless-spin-liquid ground state in fractional spatial dimension. This fractal spin system also demonstrates nontrivial non-local properties. While the extremely short-range correlation causes a highly degenerate spin form factor, the entanglement in this fractal system suggests a long-range scaling behavior. We also study the dynamic structure factor and clearly identify the gapless excitation with a stable corner excitation emerging from the ground-state entanglement. Our results unambiguously point out multiple essential properties of this fractal spin system, and open a new route to explore spin liquid and frustrated magnetism.

Published: 2024-09-24
Translated: 2024-11-09 15:57:56

# Faster Randomized Methods for Orthogonality Constrained Problems

Faster Randomized Methods for Orthogonality Constrained Problems ( http://arxiv.org/abs/2106.12060v2 )

License: see the linked page

Boris Shustin, Haim Avron

(参考訳) 近年の文献では、データサイエンスや計算科学を通じて生じる様々な行列問題の解法を高速化するためのランダム化手法の使用が提唱されている。ランダム化を利用する一般的な戦略の1つは、問題のサイズを減らす方法として使うことである。しかし、この戦略に基づく手法は、いくつかのアプリケーションに十分な精度を欠いている。ランダム化プレコンディショニング(Randomized preconditioning)は、より高精度なランダム化手法である。乱数化プレコンディショニングの最大の課題は、根底にある反復的手法の必要性であり、そのため、これまでは回帰問題や線形システムにのみランダム化プレコンディショニングが適用されてきた。本稿では、乱数化前提条件の適用を、データサイエンスで広く普及している別の重要な問題、すなわち(一般化された)直交制約による最適化問題にどのように拡張するかを示す。我々は、リーマン最適化とリーマン事前条件の枠組みに基づく、支配的な正準相関の計算問題とフィッシャー線形判別分析問題に基づくアプローチを実証する。両問題に対して,プレコンディショニングが計算コストと漸近収束に及ぼす影響を評価し,本手法の有効性を実証的に示す。

Recent literature has advocated the use of randomized methods for accelerating the solution of various matrix problems arising throughout data science and computational science. One popular strategy for leveraging randomization is to use it as a way to reduce problem size. However, methods based on this strategy lack sufficient accuracy for some applications. Randomized preconditioning is another approach for leveraging randomization, which provides higher accuracy. The main challenge in using randomized preconditioning is the need for an underlying iterative method, so randomized preconditioning has so far been applied almost exclusively to solving regression problems and linear systems. In this article, we show how to expand the application of randomized preconditioning to another important set of problems prevalent across data science: optimization problems with (generalized) orthogonality constraints. We demonstrate our approach, which is based on the framework of Riemannian optimization and Riemannian preconditioning, on the problem of computing the dominant canonical correlations and on the Fisher linear discriminant analysis problem. For both problems, we evaluate the effect of preconditioning on the computational costs and asymptotic convergence, and demonstrate empirically the utility of our approach.

Published: 2024-09-26
Translated: 2024-11-09 15:57:56

# The Loss Landscape of Deep Linear Neural Networks: A Second-Order Analysis

The loss landscape of deep linear neural networks: a second-order analysis ( http://arxiv.org/abs/2107.13289v1 )

License: see the linked page

El Mehdi Achour, François Malgouyres (IMT), Sébastien Gerchinovitz (IMT)

(参考訳) 正方形損失を伴う深部線形ニューラルネットワークの最適化環境について検討する。弱い仮定の下では、急激な局所ミニマは存在せず、局所的な極小マも存在しないことが知られている。しかし、一階アルゴリズムの力学において重要な役割を果たしうる非制限サドル点の存在と多様性は、わずかに研究されているだけである。最適化の展望を順2で完全に分析し、さらに一歩進める。我々は、すべての臨界点の中で、大域最小化点、厳格なサドル点、非制限サドル点を特徴づける。関連するすべての臨界値を列挙する。特徴付けは単純で、部分行列積のランクの条件を伴い、線形ニューラルネットワークを最適化する際に証明または観察された大域収束や暗黙の正則化にいくらか光を当てる。通過において、全大域最小化器の集合の明示的なパラメータ化を提供し、厳密で非制限的なサドル点の集合を示す。

We study the optimization landscape of deep linear neural networks with the square loss. It is known that, under weak assumptions, there are no spurious local minima and no local maxima. However, the existence and diversity of non-strict saddle points, which can play a role in first-order algorithms' dynamics, have only been lightly studied. We go a step further with a full analysis of the optimization landscape at order 2. We characterize, among all critical points, which are global minimizers, strict saddle points, and non-strict saddle points. We enumerate all the associated critical values. The characterization is simple, involves conditions on the ranks of partial matrix products, and sheds some light on global convergence or implicit regularization that have been proved or observed when optimizing linear neural networks. In passing, we provide an explicit parameterization of the set of all global minimizers and exhibit large sets of strict and non-strict saddle points.

Published: 2024-09-25
Translated: 2024-11-09 15:57:56

# The Loss Landscape of Deep Linear Neural Networks: A Second-Order Analysis

The loss landscape of deep linear neural networks: a second-order analysis ( http://arxiv.org/abs/2107.13289v3 )

License: see the linked page

El Mehdi Achour, François Malgouyres, Sébastien Gerchinovitz

Published: 2024-09-25
Translated: 2024-11-09 15:57:56

# LAViTeR: Learning Visual and Textual Representations Assisted by Image and Caption Generation

LAViTeR: Learning Aligned Visual and Textual Representations Assisted by Image and Caption Generation ( http://arxiv.org/abs/2109.04993v3 )

License: see the linked page

Mohammad Abuzar Hashemi, Zhanghexuan Li, Mihir Chauhan, Yan Shen, Abhishek Satbhai, Mir Basheer Ali, Mingchen Gao, Sargur Srihari

(参考訳) 大規模な画像テキストペアからの視覚的およびテキスト的表現の事前学習は、多くの下流視覚言語タスクの標準的アプローチになりつつある。トランスフォーマーベースのモデルは、自己教師付き学習タスクのリストを通じて、モーダル内およびモーダル内注意を学習する。本稿では,視覚およびテキスト表現学習のための新しいアーキテクチャであるLAViTeRを提案する。メインモジュールであるVisual Textual Alignment (VTA)は、GANベースの画像合成とイメージキャプションという2つの補助的なタスクによって支援される。また,学習した視覚とテキストの埋め込みの類似度を計測する新しい評価指標を提案する。 CUBとMS-COCOの2つの公開データセットによる実験結果から、関節機能埋め込み空間における視覚的およびテキスト的表現のアライメントが優れていることが示された。

Pre-training visual and textual representations from large-scale image-text pairs is becoming a standard approach for many downstream vision-language tasks. The transformer-based models learn inter- and intra-modal attention through a list of self-supervised learning tasks. This paper proposes LAViTeR, a novel architecture for visual and textual representation learning. The main module, Visual Textual Alignment (VTA), will be assisted by two auxiliary tasks, GAN-based image synthesis and Image Captioning. We also propose a new evaluation metric measuring the similarity between the learnt visual and textual embedding. The experimental results on two public datasets, CUB and MS-COCO, demonstrate superior visual and textual representation alignment in the joint feature embedding space.

Published: 2024-10-01
Translated: 2024-11-09 15:57:56

# High-Fidelity GAN Inversion for Image Attribute Editing

High-Fidelity GAN Inversion for Image Attribute Editing ( http://arxiv.org/abs/2109.06590v4 )

License: see the linked page

Tengfei Wang, Yong Zhang, Yanbo Fan, Jue Wang, Qifeng Chen

(参考訳) 本稿では, 画像固有の細部(背景, 外観, 照明など)をよく保存した属性編集が可能な, GAN(High-fidelity Generative Adversarial Network)インバージョンフレームワークを提案する。まず、損失データ圧縮の観点から、高忠実度GAN逆変換の課題を解析する。低ビットレートの遅延符号では、再構成された画像や編集された画像の高忠実度の詳細を保存することは困難である。遅延コードのサイズを増やすことで、GAN変換の精度が向上するが、編集性は劣る。編集性を損なうことなく画像の忠実度を向上させるために,歪みマップを高忠実度再構成の基準として用いた歪みコンサルテーション手法を提案する。歪みコンサルテーションインバージョン (DCI) において、歪みマップは最初、高いレートの潜時写像に投影され、次に、基本的な低レート潜時符号を、より詳細なコンサルテーション融合によって補完する。高忠実度編集を実現するために,編集画像と反転画像のギャップを埋める自己教師付きトレーニングスキームを用いた適応歪みアライメント(ADA)モジュールを提案する。顔領域と車領域における大規模な実験は、インバージョンと編集品質の両方において明らかに改善されている。

We present a novel high-fidelity generative adversarial network (GAN) inversion framework that enables attribute editing with image-specific details well-preserved (e.g., background, appearance, and illumination). We first analyze the challenges of high-fidelity GAN inversion from the perspective of lossy data compression. With a low bit-rate latent code, previous works have difficulties in preserving high-fidelity details in reconstructed and edited images. Increasing the size of a latent code can improve the accuracy of GAN inversion but at the cost of inferior editability. To improve image fidelity without compromising editability, we propose a distortion consultation approach that employs a distortion map as a reference for high-fidelity reconstruction. In the distortion consultation inversion (DCI), the distortion map is first projected to a high-rate latent map, which then complements the basic low-rate latent code with more details via consultation fusion. To achieve high-fidelity editing, we propose an adaptive distortion alignment (ADA) module with a self-supervised training scheme, which bridges the gap between the edited and inversion images. Extensive experiments in the face and car domains show a clear improvement in both inversion and editing quality.

Published: 2024-09-27
Translated: 2024-11-09 15:57:56

# A New Simple Vision Algorithm for Detecting Enzymic Browning Defects in Golden Delicious Apples

A New Simple Vision Algorithm for Detecting the Enzymic Browning Defects in Golden Delicious Apples ( http://arxiv.org/abs/2110.03574v2 )

License: see the linked page

Hamid Majidi Balanji

(参考訳) 本研究は, 酵素的玄米処理によるゴールデンデリシスリンゴの表面欠陥を抽出し, 同定するために, 簡単な視覚アルゴリズムを設計, 実装した。実験では34種類のゴールデン・デリシアスリンゴが選択され、そのうち17個は酵素的染料欠陥があり、残りの17個は音が聞こえた。提案した視覚アルゴリズムの画像処理部は, リンゴの欠陥表面積を97.15%の精度で抽出した。分割画像の面積と平均は、2x1特徴ベクトルとして選択され、設計された人工ニューラルネットワークに入力される。以上の特徴から, 平均0.0065以下の画像は, 欠陥リンゴに属さないことが明らかとなった。本研究で適用されたニューラルネットワークの分類精度は99.19%であった。

In this work, a simple vision algorithm is designed and implemented to extract and identify the surface defects on Golden Delicious apples caused by the enzymic browning process. Thirty-four Golden Delicious apples were selected for the experiments, of which 17 had enzymic browning defects and the other 17 were sound. The image processing part of the proposed vision algorithm extracted the defective surface area of the apples with a high accuracy of 97.15%. The area and mean of the segmented images were selected as the 2x1 feature vectors to feed into a designed artificial neural network. The analysis based on the above features indicated that the images with a mean less than 0.0065 did not belong to the defective apples; rather, they were extracted as part of the calyx and stem of the healthy apples. The classification accuracy of the neural network applied in this study was 99.19%.

Published: 2024-09-22
Translated: 2024-11-09 15:57:56

# Adaptive Joint Distribution Learning

Adaptive joint distribution learning ( http://arxiv.org/abs/2110.04829v5 )

License: see the linked page

Damir Filipovic, Michael Multerer, Paul Schneider

(参考訳) テンソル積再生カーネルヒルベルト空間 (RKHS) を用いた共同確率分布推定のための新しいフレームワークを開発した。我々のフレームワークはRKHSモデルの本質的な制約を緩和し、最大数百万のサンプルサイズから推定するラドン-ニコディム誘導体の低次元、正規化、正のモデルに対応している。明確に定義された正規化条件分布と正条件分布は、我々のアプローチの自然な副産物である。提案手法は,予測から分類までの学習問題を高速に計算し,対応できる。理論的な結果は好意的な数値結果によって補われている。

We develop a new framework for estimating joint probability distributions using tensor product reproducing kernel Hilbert spaces (RKHS). Our framework accommodates a low-dimensional, normalized and positive model of a Radon--Nikodym derivative, which we estimate from sample sizes of up to several millions, alleviating the inherent limitations of RKHS modeling. Well-defined normalized and positive conditional distributions are natural by-products to our approach. Our proposal is fast to compute and accommodates learning problems ranging from prediction to classification. Our theoretical findings are supplemented by favorable numerical results.

Published: 2024-09-24
Translated: 2024-11-09 15:57:56

# How Can Scientists Establish an Observer-Independent Science?

How can scientists establish an observer-independent science? Embodied cognition, consciousness and quantum mechanics ( http://arxiv.org/abs/2112.15428v3 )

License: see the linked page

John Realpe-Gómez

(参考訳) エビデンス(エビデンス)は、その行動と知覚が互いに一致して決定し、行動知覚ループを形成する、体現認知の理論のために成長している。これは、人間が何らかの形で知覚するものに参加することを示唆している。では、どのようにして科学者が行動知覚ループから逃れて、世界の観察者に依存しない説明を得ることができるのか? ここでは、心の哲学と科学と量子物理学のリバースエンジニアリングから得られる一連の予想を提示し、この問題を探求する。我々は、エンボディメントが伝統的に理解されているように、想像時間量子力学の側面を示すことができると論じる。次に、真にリアルタイムな量子力学の側面を得るのに必要な追加の制約について検討する。特に、実験を行う実施科学者は、認知を具現化するための従来のアプローチでは無視されている他の科学者の視点から説明されなければならないと推測し、観察者は、他の観察者が経験する対象と、他の観察対象を経験する「対象」の両方として補完的な役割を担わなければならない。

Evidence is growing for the theory of embodied cognition, which posits that action and perception co-determine each other, forming an action-perception loop. This suggests that we humans somehow participate in what we perceive. So, how can scientists escape the action-perception loop to obtain an observer-independent description of the world? Here we present a set of conjectures informed by the philosophy of mind and a reverse-engineering of science and quantum physics to explore this question. We argue that embodiment, as traditionally understood, can manifest aspects of imaginary-time quantum dynamics. We then explore what additional constraints are required to obtain aspects of genuine, real-time quantum dynamics. In particular, we conjecture that an embodied scientist doing experiments must be described from the perspective of another scientist, which is ignored in traditional approaches to embodied cognition, and that observers play complementary roles as both objects experienced by other observers and "subjects" that experience other objects.

Published: 2024-09-27
Translated: 2024-11-09 15:57:56

# Privacy-Preserving Logistic Regression Training with a Faster Gradient Variant

Privacy-Preserving Logistic Regression Training with A Faster Gradient Variant ( http://arxiv.org/abs/2201.10838v9 )

License: see the linked page

John Chiang

(参考訳) 暗号化されたデータに対するロジスティック回帰のトレーニングは、セキュリティ上の問題に何年も取り組んできた。本稿では、プライバシー保護ロジスティック回帰トレーニングのための効率的な勾配変種である$quadratic$$gradient$を紹介する。我々は,Nesterov の Accelerated Gradient (NAG),Adaptive Gradient Algorithm (Adagrad) およびAdamアルゴリズムを2次勾配を組み込んで拡張し,これらの改良アルゴリズムを様々なデータセット上で評価する。実験により, 従来の1次勾配法と比較して, 改良アルゴリズムは収束速度を著しく向上することを示した。さらに,同相ロジスティック回帰学習の実装に改良NAG法を適用し,わずか4回の反復で同等の結果を得ることができた。二次勾配法は2階のニュートン・ラフソン法と1階の勾配勾配勾配/上昇アルゴリズムを統合することができ、幅広い数値最適化問題に適用できる可能性は高い。

Training logistic regression over encrypted data has been a compelling approach in addressing security concerns for several years. In this paper, we introduce an efficient gradient variant, called the quadratic gradient, for privacy-preserving logistic regression training. We enhance Nesterov's Accelerated Gradient (NAG), Adaptive Gradient Algorithm (Adagrad) and Adam algorithms by incorporating their quadratic gradients and evaluate these improved algorithms on various datasets. Experimental results demonstrate that the enhanced algorithms achieve significantly improved convergence speed compared to traditional first-order gradient methods. Moreover, we applied the enhanced NAG method to implement homomorphic logistic regression training, achieving comparable results within just 4 iterations. There is a good chance that the quadratic gradient approach could integrate first-order gradient descent/ascent algorithms with the second-order Newton-Raphson methods, and that it could be applied to a wide range of numerical optimization problems.

Published: 2024-09-22
Translated: 2024-11-09 15:46:48

# Differentiating and Integrating ZX Diagrams with Applications to Quantum Machine Learning

Differentiating and Integrating ZX Diagrams with Applications to Quantum Machine Learning ( http://arxiv.org/abs/2201.13250v7 )

License: see the linked page

Quanlong Wang, Richie Yeung, Mark Koch

(参考訳) ZX計算は、幅広い応用が成功した量子技術にとって有用なツールであることが証明されている。これらの応用のほとんどは代数的性質のものである。しかし、差別化と統合を含む他のタスクは、現在のZX技術では到達できないままである。ここでは、ZX-計算の枠組み内での微分と積分を実現することにより、ZXを解析的視点に高める。本稿では,バレンプラトーの解析に量子機械学習を応用し,ZX計算の新しい解析フレームワークを具体的に解説する。

ZX-calculus has proved to be a useful tool for quantum technology with a wide range of successful applications. Most of these applications are of an algebraic nature. However, other tasks that involve differentiation and integration remain unreachable with current ZX techniques. Here we elevate ZX to an analytical perspective by realising differentiation and integration entirely within the framework of ZX-calculus. We explicitly illustrate the new analytic framework of ZX-calculus by applying it in the context of quantum machine learning for the analysis of barren plateaus.

Published: 2024-09-25
Translated: 2024-11-09 15:46:48

# A Coding Framework and Benchmark for Low-Bitrate Video Understanding

A Coding Framework and Benchmark towards Low-Bitrate Video Understanding ( http://arxiv.org/abs/2202.02813v3 )

License: see the linked page

Yuan Tian, Guo Lu, Yichao Yan, Guangtao Zhai, Li Chen, Zhiyong Gao

(参考訳) ビデオ圧縮は、ほとんどのビデオ分析システムにとって不可欠である。転送帯域を節約しているにもかかわらず、特に低ビットレート設定では、下流のビデオ理解タスクも悪化する。この問題を体系的に検討するために,我々はまず,従来の手法,すなわちタスク分離,ラベルなし,データエマージされたセマンティクスという3つの原則が,マシンフレンドリーなコーディングフレームワークにとって重要であるが,今のところ完全に満足していないことを明らかにした。本稿では,従来のコーデックとニューラルネットワーク(NN)の両方を活用することによって,これらすべての原則を同時に満たす従来型ニューラル混合コーディングフレームワークを提案する。一方、従来のコーデックはビデオのピクセル信号を効率的に符号化できるが、意味情報を歪ませることもある。一方、高非線形NNは、ビデオセマンティクスをコンパクトな表現に凝縮するのに熟練している。このフレームワークは、自己管理された方法でラベルのないデータから自発的に学習されるコーディング手順に、動画の移動効率のよい意味表現が保存されることを保証することで最適化される。 2つのストリーム(コーデックとNN)から共同でデコードされたビデオは、リッチなセマンティクスを持ち、視覚的に写真リアリスティックであり、いくつかの主流のダウンストリームビデオ分析タスクのパフォーマンスを、後処理なしで実証的に向上させる。さらに,アテンション機構とアダプティブ・モデリング・スキームを導入することで,本手法の映像セマンティック・モデリング能力をさらに強化する。最後に、8つのデータセット上の3つの下流タスクを備えた低ビットレートビデオ理解ベンチマークを構築し、我々のアプローチの顕著な優位性を実証した。すべてのコード、データ、モデルは、 \url{https://github.com/tianyuan168326/VCS-Pytorch}で利用可能である。

Video compression is indispensable to most video analysis systems. Despite saving transportation bandwidth, it also deteriorates downstream video understanding tasks, especially at low-bitrate settings. To systematically investigate this problem, we first thoroughly review the previous methods, revealing that three principles, i.e., task-decoupled, label-free, and data-emerged semantic prior, are critical to a machine-friendly coding framework but are not fully satisfied so far. In this paper, we propose a traditional-neural mixed coding framework that simultaneously fulfills all these principles, by taking advantage of both traditional codecs and neural networks (NNs). On one hand, the traditional codecs can efficiently encode the pixel signal of videos but may distort the semantic information. On the other hand, highly non-linear NNs are proficient in condensing video semantics into a compact representation. The framework is optimized by ensuring that a transportation-efficient semantic representation of the video is preserved w.r.t. the coding procedure, which is spontaneously learned from unlabeled data in a self-supervised manner. The videos collaboratively decoded from two streams (codec and NN) are of rich semantics, as well as visually photo-realistic, empirically boosting several mainstream downstream video analysis task performances without any post-adaptation procedure. Furthermore, by introducing the attention mechanism and adaptive modeling scheme, the video semantic modeling ability of our approach is further enhanced. Finally, we build a low-bitrate video understanding benchmark with three downstream tasks on eight datasets, demonstrating the notable superiority of our approach. All codes, data, and models will be available at https://github.com/tianyuan168326/VCS-Pytorch.

Published: 2024-09-22
Translated: 2024-11-09 15:46:48

# Counterfactual Inference for Sequential Experiments

Counterfactual inference for sequential experiments ( http://arxiv.org/abs/2202.06891v4 )

License: see the linked page

Raaz Dwivedi, Katherine Tian, Sabina Tomkins, Predrag Klasnja, Susan Murphy, Devavrat Shah

(参考訳) 複数の単位が時間とともに適応する処理ポリシーを用いて、複数の時間点に対する処理を割り当てるシーケンシャルな設計実験のアフタースタディ統計的推論を考察する。我々の目標は、最小限の可能な規模(各単位と各単位の異なる処理の下での平均結果)で、適応的な処理ポリシーに関する最小限の仮定で、カウンターファクト平均に対する推論保証を提供することです。反事実的手段に関する構造的な仮定がなければ、この課題は観測されたデータポイントよりも多くの未知のために実現不可能である。そこで本研究では,非線形混合効果モデルの非パラメトリック一般化と,先行研究で考慮された双線形潜在因子モデルの非パラメトリック一般化として機能する潜在因子モデルを提案する。推定には、近辺の変種である非パラメトリック法を用い、各単位と各時間に対する対実平均に対して非漸近的高確率誤差を定めている。正規性条件の下では、この境界は、単位数と時間点が適切な速度で一緒に$\infty$に増加するにつれて、反ファクトリアル平均に対する漸近的に妥当な信頼区間をもたらす。我々は,いくつかのシミュレーションと,モバイル医療臨床試験HeartStepsのデータを含むケーススタディを通して,我々の理論を解説する。

We consider after-study statistical inference for sequentially designed experiments wherein multiple units are assigned treatments for multiple time points using treatment policies that adapt over time. Our goal is to provide inference guarantees for the counterfactual mean at the smallest possible scale -- mean outcome under different treatments for each unit and each time -- with minimal assumptions on the adaptive treatment policy. Without any structural assumptions on the counterfactual means, this challenging task is infeasible due to more unknowns than observed data points. To make progress, we introduce a latent factor model over the counterfactual means that serves as a non-parametric generalization of the non-linear mixed effects model and the bilinear latent factor model considered in prior works. For estimation, we use a non-parametric method, namely a variant of nearest neighbors, and establish a non-asymptotic high probability error bound for the counterfactual mean for each unit and each time. Under regularity conditions, this bound leads to asymptotically valid confidence intervals for the counterfactual mean as the number of units and time points grows to $\infty$ together at suitable rates. We illustrate our theory via several simulations and a case study involving data from a mobile health clinical trial HeartSteps.

Published: 2024-09-22
Translated: 2024-11-09 15:46:48

# Towards Flexible Anonymous Networks

Towards Flexible Anonymous Networks ( http://arxiv.org/abs/2203.03764v4 )

License: see the linked page

Florentin Rochet, Jules Dejaeghere, Tariq Elahi

(参考訳) Torのような匿名通信設計は、様々なグローバルな場所でリレーを走らせる多くのボランティアに対して、分散信頼に基づくセキュリティを構築している。実際には、この分布はTorソフトウェアの多くのバージョンが共存する異種ネットワークにつながり、それぞれ異なるプロトコル機能を持つ。この異種性のため、Tor開発者はネットワークの拡張性を維持する戦略として、前方互換のプロトコル設計を採用する。この戦略は、Torソフトウェアの異なるバージョンが、回復不可能なエラーなしに相互作用することを保証することを目的としている。本研究は,プロトコルの基本的なセキュリティ問題として,前方互換性のあるプロトコルの考慮によって実現されるプロトコル寛容を論じる。私たちは、開発者にとって有益である一方で、プロトコルの寛容さは、過去15年間にTorに対する強力な攻撃を引き起こしている、と論じています。この問題に対処するために、Flexible Anonymous Network (FAN)を提案する。これはボランティアベースの分散ネットワークのための新しいソフトウェアアーキテクチャで、開発者がソフトウェアを継続的に進化させる能力を失うことなく、依存関係をプロトコル寛容からシフトさせる。我々は、(i) 実装をインスタンス化し、(ii) そのオーバーヘッドを評価し、(iii) 現在もTorに当てはまる深刻な攻撃に対する防御においてFANがもたらす利点のいくつかを実験する。

Anonymous Communication designs such as Tor build their security on distributed trust over many volunteers running relays in diverse global locations. In practice, this distribution leads to a heterogeneous network in which many versions of the Tor software co-exist, each with differing sets of protocol features. Because of this heterogeneity, Tor developers employ forward-compatible protocol design as a strategy to maintain network extensibility. This strategy aims to guarantee that different versions of the Tor software interact without unrecoverable errors. In this work, we cast protocol tolerance that is enabled by forward-compatible protocol considerations as a fundamental security issue. We argue that, while being beneficial for the developers, protocol tolerance has resulted in a number of strong attacks against Tor in the past fifteen years. To address this issue, we propose Flexible Anonymous Network (FAN), a new software architecture for volunteer-based distributed networks that shifts the dependence away from protocol tolerance without losing the ability for developers to ensure the continuous evolution of their software. We (i) instantiate an implementation, (ii) evaluate its overheads, and (iii) experiment with several of FAN's benefits to defend against a severe attack still applicable to Tor today.

Published: 2024-09-23
Translated: 2024-11-09 15:46:48

# Recursive Variational Quantum Compiling

Recursive Variational Quantum Compiling ( http://arxiv.org/abs/2203.08514v2 )

License: see the linked page

Stian Bilek, Kristian Wold

(参考訳) 変分量子コンパイル(VQC)アルゴリズムは、深い量子回路を浅いパラメータ化アンサーゼで近似することを目的としており、NISQハードウェアにより適している。本稿では、再帰的変動量子コンパイル(RVQC)アルゴリズムと呼ばれるVQCの変種を提案する。既存のVQCアルゴリズムでは、コンパイル中に全回路をコヒーレントに実行する必要がある。ノイズの影響下では、十分に深いターゲット回路は通常のVQCではコンパイルが不可能となる。コンパイルはしばしば勾配に基づく量子古典的アプローチによって達成されるので、量子ノイズは最適化時にノイズの勾配として表され、収束が困難になる。一方、RVQCは、まずそれを$N$の短いサブ回路に分割し、一度に1つのサブ回路を評価することで、回路をコンパイルすることができる。その結果、RVQCを実装するために必要な回路深さは、ターゲット回路の深さではなく、サブ回路の深さに依存する。高い$N$を選択することで、個々のコンパイルを成功させるのに十分な浅いサブ回路が確保できる。 RVQCはIBM SantiagoデバイスのノイズモデルでVQCと比較され、ランダムに生成された5ビット回路を約1000深さでコンパイルすることを目的としていた。 VQCは500回の最適化で収束できなかった。一方、RVQCは、ターゲット回路を$N = 5$に分割する際に、合計500回のイテレーションで0.90 \pm 0.05$の忠実度に収束することができた。

Variational quantum compiling (VQC) algorithms aim to approximate deep quantum circuits with shallow parameterized ansatzes, making them more suitable for NISQ hardware. In this article, a variant of VQC named the recursive variational quantum compiling (RVQC) algorithm is proposed. Existing VQC algorithms typically require coherently executing the full circuit during compilation. Under the influence of noise, sufficiently deep target circuits make compiling unfeasible using ordinary VQC. Since the compiling is often accomplished using a gradient-based quantum-classical approach, the quantum noise manifests as a noisy gradient during optimization, making convergence hard to obtain. On the other hand, RVQC can compile a circuit by first dividing it into $N$ shorter sub-circuits, then evaluating one sub-circuit at a time. As a result, the circuit depth required to implement RVQC is not dependent on the depth of the target circuit, but on the depth of the sub-circuits. Choosing a high enough $N$ thus ensures sufficiently shallow sub-circuits that can be successfully compiled individually. RVQC was compared with VQC on a noise model of the IBM Santiago device with the goal of compiling several randomly generated five-qubit circuits of approximately depth 1000. It was shown that VQC was not able to converge within 500 iterations of optimization. On the other hand, RVQC was able to converge to a fidelity of $0.90 \pm 0.05$ within a total of 500 iterations when splitting the target circuits into $N = 5$ parts.
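The depth argument in this abstract can be sketched with a few lines of bookkeeping: splitting a circuit of depth about 1000 into N = 5 contiguous parts leaves sub-circuits of depth about 200 each. The helper `split_circuit` and the representation of a circuit as a plain list of layer labels are hypothetical simplifications for illustration; the sketch covers only the splitting step, not the compilation itself.

```python
def split_circuit(layers, n_parts):
    """Split a list of circuit layers into n_parts contiguous sub-circuits
    whose lengths differ by at most one (hypothetical helper)."""
    if n_parts < 1 or n_parts > len(layers):
        raise ValueError("n_parts must be between 1 and the number of layers")
    base, extra = divmod(len(layers), n_parts)
    parts = []
    start = 0
    for i in range(n_parts):
        size = base + (1 if i < extra else 0)
        parts.append(layers[start:start + size])
        start += size
    return parts


# A stand-in for a depth-1000 target circuit: one label per layer.
layers = [f"layer_{i}" for i in range(1000)]
parts = split_circuit(layers, 5)
depths = [len(p) for p in parts]
print(depths)
```

Each sub-circuit is a fifth of the original depth, which is the quantity that, according to the abstract, determines the circuit depth RVQC has to execute.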

Published: 2024-09-24
Translated: 2024-11-09 15:46:48

# A Sandbox Environment for Generalizable Agent Research

The Sandbox Environment for Generalizable Agent Research (SEGAR) ( http://arxiv.org/abs/2203.10351v2 )

License: see the linked page

R Devon Hjelm, Bogdan Mazoure, Florian Golemo, Samira Ebrahimi Kahou, Pedro Braga, Felipe Frujeri, Mihai Jalobeanu, Andrey Kolobov

(参考訳) 対話型環境における逐次意思決定タスクの一般化に関する研究の課題は、明らかに進歩を示すベンチマークを設計することである。目立った道のりはあったが、現在のベンチマークでは、適切な露出や根底にある要因の直感的な制御を提供しておらず、簡単に実装でき、カスタマイズ可能で、拡張可能でもなく、計算に費用がかかる。汎用エージェント研究のためのサンドボックス環境(SEGAR)を構築した。 SEGARは、一般化目的をタスク分布を指定することで容易に設計できるので、RLにおける一般化研究の容易さと説明責任を向上させる。本稿では、SEGARの概要と、SEGARがこれらの目標にどのように貢献するか、および、SEGARが答えられるいくつかの研究課題を実証する実験を紹介する。

A broad challenge of research on generalization for sequential decision-making tasks in interactive environments is designing benchmarks that clearly mark progress. While there has been notable headway, current benchmarks either do not provide suitable exposure to or intuitive control of the underlying factors, are not easy to implement, customizable, or extensible, or are computationally expensive to run. We built the Sandbox Environment for Generalizable Agent Research (SEGAR) with all of these things in mind. SEGAR improves the ease and accountability of generalization research in RL, as generalization objectives can be easily designed by specifying task distributions, which in turn allows the researcher to measure the nature of the generalization objective. We present an overview of SEGAR and how it contributes to these goals, as well as experiments that demonstrate a few types of research questions SEGAR can help answer.

Published: 2024-09-26
Translated: 2024-11-09 15:46:48

# Quantum Routing with Teleportation

Quantum Routing with Teleportation ( http://arxiv.org/abs/2204.04185v2 )

License: see the linked page

Dhruv Devulapalli, Eddie Schoute, Aniruddha Bapat, Andrew M. Childs, Alexey V. Gorshkov

(参考訳) 量子系における相互作用制約下での量子ビットの任意の置換を任意に行うことで、高速な局所演算と古典的通信(LOCC)が可能な問題について検討する。特に,スワップベースおよびより一般的なユニタリルーティング手法による高速化の例として,絡み合いを分散し,LOCCを用いて量子テレポーテーションを行う例を示す。さらに,通信通信がスワップベースのルーティングよりも最悪のルーティング時間で対数的に高速化する相互作用グラフの例を述べる。また、量子テレポーテーションによって得られるスピードアップの限界(O(\sqrt{N \log N})$上界)について検討し、グラフの一般的なクラスに対してより厳密な境界を与える。

We study the problem of implementing arbitrary permutations of qubits under interaction constraints in quantum systems that allow for arbitrarily fast local operations and classical communication (LOCC). In particular, we show examples of speedups over swap-based and more general unitary routing methods by distributing entanglement and using LOCC to perform quantum teleportation. We further describe an example of an interaction graph for which teleportation gives a logarithmic speedup in the worst-case routing time over swap-based routing. We also study limits on the speedup afforded by quantum teleportation - showing an $O(\sqrt{N \log N})$ upper bound on the separation in routing time for any interaction graph - and give tighter bounds for some common classes of graphs.

Published: 2024-09-23
Translated: 2024-11-09 15:46:48

# Discovering Stochastic Dynamical Equations from Biological Time Series Data

Discovering stochastic dynamical equations from biological time series data ( http://arxiv.org/abs/2205.02645v6 )

License: see the linked page

Arshed Nabeel, Ashwin Karichannavar, Shuaib Palathingal, Jitesh Jhawar, David B. Brückner, Danny Raj M., Vishwesha Guttal

(参考訳) 理論的研究により、確率性は反直観的な方法で生態系の力学に影響を与えることが示されている。しかし、個体群や生態系の動態を規定する方程式を知らずに、実際のデータセットにおける確率性の役割を確かめることは困難である。したがって、データセットから支配確率方程式を推定する逆問題は重要である。本稿では,状態変数の時系列データを入力とし,確率微分方程式を出力する方程式探索手法を提案する。確率計算からの従来のアプローチと方程式発見手法を組み合わせることでこれを実現できる。いくつかの応用を通して,本手法の一般化を実証する。まず、基本的に異なる支配方程式を持つ様々な確率モデルを意図的に選択するが、ほぼ同一の定常分布を生成する。時系列データのみの解析から,正しい基礎となる方程式を復元し,その安定性を正確に推定できることが示される。我々は,魚の学習と単一細胞移動という,時空間スケールとダイナミクスの異なる2つの実世界のデータセット上で,我々の手法を実証する。本手法の様々な限界と潜在的な落とし穴と診断方法による克服方法について述べる。最後に、PyDaDDy(Python Library for Data Driven Dynamics)というパッケージを通じて、オープンソースコードを提供しています。

Theoretical studies have shown that stochasticity can affect the dynamics of ecosystems in counter-intuitive ways. However, without knowing the equations governing the dynamics of populations or ecosystems, it is difficult to ascertain the role of stochasticity in real datasets. Therefore, the inverse problem of inferring the governing stochastic equations from datasets is important. Here, we present an equation discovery methodology that takes time series data of state variables as input and outputs a stochastic differential equation. We achieve this by combining traditional approaches from stochastic calculus with the equation-discovery techniques. We demonstrate the generality of the method via several applications. First, we deliberately choose various stochastic models with fundamentally different governing equations; yet they produce nearly identical steady-state distributions. We show that we can recover the correct underlying equations, and thus infer the structure of their stability, accurately from the analysis of time series data alone. We demonstrate our method on two real-world datasets -- fish schooling and single-cell migration -- which have vastly different spatiotemporal scales and dynamics. We illustrate various limitations and potential pitfalls of the method and how to overcome them via diagnostic measures. Finally, we provide our open-source codes via a package named PyDaDDy (Python library for Data Driven Dynamics).
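To make the inverse problem in this abstract concrete, here is a minimal sketch that simulates a known stochastic differential equation and then recovers its drift and diffusion from the time series alone. It does not use PyDaDDy or reproduce the authors' method: the Ornstein-Uhlenbeck test model, its parameters, the linear drift ansatz and the helper names are all assumptions made for illustration.

```python
import math
import random


def simulate_ou(theta, sigma, dt, n_steps, seed=0):
    """Euler-Maruyama simulation of dx = -theta * x dt + sigma dW."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        x.append(x[-1] - theta * x[-1] * dt + sigma * dw)
    return x


def estimate_linear_sde(x, dt):
    """Estimate a drift of the assumed form f(x) = a * x by least squares on
    the increments, and a constant squared diffusion g^2 from the mean
    squared increment divided by dt."""
    dx = [x[i + 1] - x[i] for i in range(len(x) - 1)]
    xs = x[:-1]
    sxx = sum(v * v for v in xs)
    sxy = sum(v * d / dt for v, d in zip(xs, dx))
    a_hat = sxy / sxx
    g2_hat = sum(d * d for d in dx) / (dt * len(dx))
    return a_hat, g2_hat


series = simulate_ou(theta=1.0, sigma=0.5, dt=0.01, n_steps=100_000)
a_hat, g2_hat = estimate_linear_sde(series, dt=0.01)
print(f"drift slope ~ {a_hat:.2f} (true -1.0), g^2 ~ {g2_hat:.3f} (true 0.25)")
```

A real equation-discovery pipeline would fit a richer library of candidate terms and select among them; the linear ansatz here only shows how increments of a time series carry information about both the deterministic and the stochastic part of the dynamics.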

Published: 2024-09-22
Translated: 2024-11-09 15:46:48

# Does DQN Learn?

Does DQN Learn? ( http://arxiv.org/abs/2205.13617v4 )

License: see the linked page

Aditya Gopalan, Gugan Thoppe

(参考訳) 強化学習法が有用であるためには、その限界で見積もるポリシーは、少なくとも平均的には、初期推定よりも優れている必要がある。本研究では,全ての可能な状態や動作を無限に見ることができても,広く使用されている深層Q-Network (DQN) が,この基本的な基準を満たさないことを示す(この条件により,表型Q-ラーニングの最適Q-値への収束が保証される)。私たちの作品のハイライトは以下のとおりです。第一に、DQNは一般的に、初期よりも悪い政策を生み出す非自明な確率を持つことを示す。第二に、線形DQNの文脈でこの振る舞いを理論的に説明し、ニューラルネットワークを線形関数近似に置き換えるが、DQNの他の重要な概念、例えば経験的リプレイ、ターゲットネットワーク、および$\epsilon$-greedy探索を保持する。我々の主な結果は、線形DQNの尾の挙動は、決定論的微分包含の不変集合、つまり微分方程式の集合値一般化によって支配されることである。特に、これらの不変集合は局所的最適ポリシーと整合する必要はないことを示し、DQNの準最適ポリシーへの収束や政策振動といった病理学的挙動を説明する。また、制限ポリシーが常に最悪であるシナリオも提供します。我々の研究は、関数近似と$\epsilon$-greedyの探索によるQ-ラーニングの振る舞いの理解における長年のギャップに対処する。

For a reinforcement learning method to be useful, the policy it estimates in the limit must be superior to the initial guess, at least on average. In this work, we show that the widely used Deep Q-Network (DQN) fails to meet even this basic criterion, even when it gets to see all possible states and actions infinitely often (a condition that ensures tabular Q-learning's convergence to the optimal Q-value). Our work's key highlights are as follows. First, we numerically show that DQN generally has a non-trivial probability of producing a policy worse than the initial one. Second, we give a theoretical explanation for this behavior in the context of linear DQN, wherein we replace the neural network with a linear function approximation but retain DQN's other key ideas, such as experience replay, target network, and $\epsilon$-greedy exploration. Our main result is that the tail behaviors of linear DQN are governed by invariant sets of a deterministic differential inclusion, a set-valued generalization of a differential equation. Notably, we show that these invariant sets need not align with locally optimal policies, thus explaining DQN's pathological behaviors, such as convergence to sub-optimal policies and policy oscillation. We also provide a scenario where the limiting policy is always the worst. Our work addresses a longstanding gap in understanding the behaviors of Q-learning with function approximation and $\epsilon$-greedy exploration.

公開日:2024-09-21
翻訳日:2024-11-09 15:46:48
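
The ingredients of linear DQN named in the abstract (linear function approximation, experience replay, a target network, and $\epsilon$-greedy exploration) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, hyperparameters, and the two-state toy environment are hypothetical. With the one-hot features used here the model reduces to the tabular case, so the pathologies analysed in the paper are not expected to appear; they concern general feature maps.

```python
import random


def q_values(w, phi):
    """Linear Q-function: one weight vector per action, Q(s, a) = w[a] . phi(s)."""
    return [sum(wi * pi for wi, pi in zip(wa, phi)) for wa in w]


def epsilon_greedy(w, phi, eps, rng):
    """Pick a random action with probability eps, otherwise the greedy one."""
    if rng.random() < eps:
        return rng.randrange(len(w))
    q = q_values(w, phi)
    return max(range(len(q)), key=q.__getitem__)


def linear_dqn(features, step, n_actions, n_iters=5000, gamma=0.9,
               lr=0.05, eps=0.1, batch=8, sync_every=100, seed=0):
    """Train a linear Q-function with replay, a target network, and eps-greedy."""
    rng = random.Random(seed)
    dim = len(features[0])
    w = [[0.0] * dim for _ in range(n_actions)]
    w_target = [row[:] for row in w]          # frozen copy used for bootstrapping
    replay = []                               # experience replay buffer
    s = 0
    for t in range(n_iters):
        a = epsilon_greedy(w, features[s], eps, rng)
        r, s_next = step(s, a)
        replay.append((s, a, r, s_next))
        for bs, ba, br, bn in rng.sample(replay, min(batch, len(replay))):
            target = br + gamma * max(q_values(w_target, features[bn]))
            td_error = target - q_values(w, features[bs])[ba]
            for i, p in enumerate(features[bs]):
                w[ba][i] += lr * td_error * p
        if (t + 1) % sync_every == 0:
            w_target = [row[:] for row in w]  # periodic target-network sync
        s = s_next
    return w


def toy_step(state, action):
    """Hypothetical 2-state MDP: action 1 pays reward 1 and leads to state 1."""
    return (1.0, 1) if action == 1 else (0.0, 0)


one_hot = [[1.0, 0.0], [0.0, 1.0]]
weights = linear_dqn(one_hot, toy_step, n_actions=2)
print(q_values(weights, one_hot[0]), q_values(weights, one_hot[1]))
```

Replacing `one_hot` with a lower-dimensional or overlapping feature map is the natural way to experiment with the non-tabular setting the paper studies.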

# GraphMLP: 3Dヒューマンポース推定のためのグラフMLPライクなアーキテクチャ

GraphMLP: A Graph MLP-Like Architecture for 3D Human Pose Estimation ( http://arxiv.org/abs/2206.06420v5 )

ライセンス: Link先を確認

Wenhao Li, Mengyuan Liu, Hong Liu, Tianyu Guo, Ti Wang, Hao Tang, Nicu Sebe,

(参考訳) 現代の多層パーセプトロン(MLP)モデルは、自己注意なしで視覚表現を学習する際の競合的な結果を示している。しかし、既存のMLPモデルは、局所的な詳細を捉えるのが得意ではなく、人体構成に関する事前の知識が欠けているため、骨格表現学習のモデリング能力は制限されている。これらの課題に対処するため,我々は,3次元ポーズ推定のためのグローバル・ローカル・グラフィック統一アーキテクチャにおいて,MLPとGCNを組み合わせたグラフ強化型MLPアーキテクチャーGraphMLPを提案する。 GraphMLPは、人体のグラフ構造をMLPモデルに組み込んで、3D人間のポーズのドメイン固有の要求を満たすとともに、局所的およびグローバルな空間的相互作用を可能にする。さらに,GraphMLPをビデオ領域に柔軟かつ効率的に拡張し,複雑な時間的ダイナミクスを,列長が無視できる計算コストゲインの簡単な方法で効果的にモデル化できることを提案する。我々の知る限りでは、これは単一のフレームとビデオシーケンスで3次元のポーズ推定を行う最初のMLPライクなアーキテクチャである。大規模な実験により、提案したGraphMLPは、Human3.6MとMPI-INF-3DHPの2つのデータセットで最先端のパフォーマンスを達成することが示された。コードとモデルはhttps://github.com/Vegetebird/GraphMLPで公開されている。

Modern multi-layer perceptron (MLP) models have shown competitive results in learning visual representations without self-attention. However, existing MLP models are not good at capturing local details and lack prior knowledge of human body configurations, which limits their modeling power for skeletal representation learning. To address these issues, we propose a simple yet effective graph-reinforced MLP-Like architecture, named GraphMLP, that combines MLPs and graph convolutional networks (GCNs) in a global-local-graphical unified architecture for 3D human pose estimation. GraphMLP incorporates the graph structure of human bodies into an MLP model to meet the domain-specific demand of the 3D human pose, while allowing for both local and global spatial interactions. Furthermore, we propose to flexibly and efficiently extend the GraphMLP to the video domain and show that complex temporal dynamics can be effectively modeled in a simple way with negligible computational cost gains in the sequence length. To the best of our knowledge, this is the first MLP-Like architecture for 3D human pose estimation in a single frame and a video sequence. Extensive experiments show that the proposed GraphMLP achieves state-of-the-art performance on two datasets, i.e., Human3.6M and MPI-INF-3DHP. Code and models are available at https://github.com/Vegetebird/GraphMLP.

公開日:2024-09-21
翻訳日:2024-11-09 15:46:48

# 人間の目に触発されたリカレントニューラルネットワークは、敵の騒音に対してよりロバストである

Human Eyes Inspired Recurrent Neural Networks are More Robust Against Adversarial Noises ( http://arxiv.org/abs/2206.07282v2 )

ライセンス: Link先を確認

Minkyu Choi, Yizhen Zhang, Kuan Han, Xiaokai Wang, Zhongming Liu,

(参考訳) 人間は、静かな物体に焦点をあて、自明な詳細を無視して、視覚的な環境を積極的に観察する。しかし、畳み込みニューラルネットワーク(CNN)に基づくコンピュータビジョンモデルは、単一のフィードフォワードパスを通じて、視覚的な入力を一度に分析することが多い。本研究では、人間の脳にインスパイアされたデュアルストリーム視覚モデルを設計した。このモデルは網膜のような入力層を特徴とし、次の焦点(固定点)を決定する2つのストリームと、固定点を取り巻く視覚を解釈する2つのストリームを含む。このモデルは、画像認識に基づいて、様々な部分に焦点を当てる度に、一連の固定を通して画像を検査し、画像の表現を段階的に構築する。このモデルを,物体認識,視線行動,対向強靭性の観点から評価した。以上の結果から,本モデルは人間の注意を模倣する訓練を受けずに,人間と類似した形で観察し,網膜サンプリングや反復処理による敵の攻撃に対する堅牢性を高めることが可能であることが示唆された。特に、このモデルは、フィードフォワードのみのモデルとは切り離して、よりよく見ることによって、知覚上のエラーを修正することができる。結論として, 網膜サンプリング, 眼球運動, リカレントダイナミクスの相互作用は, 人間の視覚的探索や推論において重要である。

Humans actively observe the visual surroundings by focusing on salient objects and ignoring trivial details. However, computer vision models based on convolutional neural networks (CNN) often analyze visual input all at once through a single feed-forward pass. In this study, we designed a dual-stream vision model inspired by the human brain. This model features retina-like input layers and includes two streams: one determining the next point of focus (the fixation), while the other interprets the visuals surrounding the fixation. Trained on image recognition, this model examines an image through a sequence of fixations, each time focusing on different parts, thereby progressively building a representation of the image. We evaluated this model against various benchmarks in terms of object recognition, gaze behavior and adversarial robustness. Our findings suggest that the model can attend and gaze in ways similar to humans without being explicitly trained to mimic human attention, and that the model can enhance robustness against adversarial attacks due to its retinal sampling and recurrent processing. In particular, the model can correct its perceptual errors by taking more glances, setting itself apart from all feed-forward-only models. In conclusion, the interactions of retinal sampling, eye movement, and recurrent dynamics are important to human-like visual exploration and inference.

公開日:2024-09-26
翻訳日:2024-11-09 15:46:48

# トークンによる支払いシステム

Token-Based Payment Systems ( http://arxiv.org/abs/2207.07530v2 )

ライセンス: Link先を確認

Geoffrey Goodell,

(参考訳) 本稿では,デジタル決済システムにおけるトークンと分散台帳の役割について考察する。本稿では,トークンを用いたデジタル決済システムの簡単な分類法を提案し,分散台帳技術がデジタル決済システム全般をサポートする方法の異なるモデルに対処する。我々は、消費者プライバシ、トークン発行、システムオペレーターに対する説明責任の観点から理解したデジタル決済システムの健全な機能に関するガイダンスを提供する。

In this article, we consider the roles of tokens and distributed ledgers in digital payment systems. We present a brief taxonomy of digital payment systems that use tokens, and we address the different models for how distributed ledger technology can support digital payment systems in general. We offer guidance on the salient features of digital payment systems, which we comprehend in terms of consumer privacy, token issuance, and accountability for system operators.

公開日:2024-09-21
翻訳日:2024-11-09 15:46:48

# マルチロボットコーディネーションのための分散微分可能な動的ゲーム

Distributed Differentiable Dynamic Game for Multi-robot Coordination ( http://arxiv.org/abs/2207.08892v4 )

ライセンス: Link先を確認

Yizhi Zhou, Wanxin Jin, Xuan Wang,

(参考訳) 本稿では,マルチロボット協調における前方および逆問題の効率よく解決できる分散微分可能動的ゲーム(D3G)フレームワークを開発する。我々は,ロボットの動作が,他者の行動にも依存する自身のダイナミクスと目的によって決定される動的ゲームとして,マルチロボット協調を定式化する。前方問題では、D3Gは分散シューティングベースのナッシュソルバを開発することにより、全てのロボットが協調してゲームのナッシュ平衡を分散的に求めることを可能にする。ロボットが与えられた協調デモを模倣する目的(およびダイナミクス)パラメータを見つけ(学習)する逆問題において、D3Gは微分ポントリャーギンの最大原理に基づく微分解法を提案し、各ロボットがパラメータを分散的かつ協調的に更新できるようにする。タスク構成が異なる2種類のロボットを用いてD3Gをシミュレーションでテストする。その結果, 従来の手法と比較して, 前方および逆問題の解法におけるD3Gの有効性が示された。

This paper develops a Distributed Differentiable Dynamic Game (D3G) framework, which can efficiently solve the forward and inverse problems in multi-robot coordination. We formulate multi-robot coordination as a dynamic game, where the behavior of a robot is dictated by its own dynamics and objective, which also depend on others' behavior. In the forward problem, D3G enables all robots to collaboratively seek the Nash equilibrium of the game in a distributed manner, by developing a distributed shooting-based Nash solver. In the inverse problem, where each robot aims to find (learn) its objective (and dynamics) parameters to mimic given coordination demonstrations, D3G proposes a differentiation solver based on Differential Pontryagin's Maximum Principle, which allows each robot to update its parameters in a distributed and coordinated manner. We test D3G in simulation with two types of robots given different task configurations. The results demonstrate the effectiveness of D3G for solving both forward and inverse problems in comparison with existing methods.

公開日:2024-09-23
翻訳日:2024-11-09 15:46:48

# ベリー-ディポールの遷移における外在的および内在的非線形ホール効果

Extrinsic and Intrinsic Nonlinear Hall Effects across Berry-Dipole Transitions ( http://arxiv.org/abs/2208.02972v2 )

ライセンス: Link先を確認

Zheng-Yang Zhuang, Zhongbo Yan,

(参考訳) 3次元ホップ絶縁体(3-dimensional Hopf insulator)は、トポロジカル位相のクラスである。異なるホップ不変量を持つ2つの回転不変ホップ絶縁体相を分離する臨界点は、通常のディラック型やワイル型臨界点とは大きく異なり、量子化されたベリー双極子によって特徴付けられる。このようなベリー-双極子遷移に近く、弱ドーピング状態における外在的および内在的非線形ホール伝導率テンソルは、ドーピングレベルとバルクエネルギーギャップの比の2つの普遍関数によって特徴づけられ、遷移のホップ不変量の変化に直接比例する。我々の研究は、非線形ホール効果はベリー-双極子遷移全体にわたって一般的な量子化挙動を示し、非線形ホール効果とホップ不変量との対応性を確立することを示唆している。

Three-dimensional Hopf insulators are a class of topological phases beyond the tenfold-way classification. The critical point separating two rotation-invariant Hopf insulator phases with distinct Hopf invariants is quite different from the usual Dirac-type or Weyl-type critical points and uniquely characterized by a quantized Berry dipole. Close to such Berry-dipole transitions, we find that the extrinsic and intrinsic nonlinear Hall conductivity tensors in the weakly doped regime are characterized by two universal functions of the ratio between doping level and bulk energy gap, and are directly proportional to the change in Hopf invariant across the transition. Our work suggests that the nonlinear Hall effects display a general-sense quantized behavior across Berry-dipole transitions, establishing a correspondence between nonlinear Hall effects and Hopf invariant.

公開日:2024-09-27
翻訳日:2024-11-09 15:46:48

# ディープラーニングのためのラデマッハ複雑度に基づく一般化境界について

On Rademacher Complexity-based Generalization Bounds for Deep Learning ( http://arxiv.org/abs/2208.04284v3 )

ライセンス: Link先を確認

Lan V. Truong,

(参考訳) Rademacherの複雑性に基づくアプローチは、少数の画像のクラスを分類するために、畳み込みニューラルネットワーク(CNN)上の非空の一般化バウンダリを生成することができる。一般リプシッツ活性化関数に対する関数空間とCNNの間の高次元写像のための新しいタラグランド縮約補題の開発は重要な技術的貢献である。以上の結果から,ReLU,Leaky ReLU,Parametric Rectifier Linear Unit,Sigmoid,Tanhなどの特別なアクティベーション機能を持つCNNのネットワーク長に依存しないことがわかった。

We show that the Rademacher complexity-based approach can generate non-vacuous generalisation bounds on Convolutional Neural Networks (CNNs) for classifying a small number of classes of images. The development of new Talagrand's contraction lemmas for high-dimensional mappings between function spaces and CNNs for general Lipschitz activation functions is a key technical contribution. Our results show that the Rademacher complexity does not depend on the network length for CNNs with some special types of activation functions such as ReLU, Leaky ReLU, Parametric Rectifier Linear Unit, Sigmoid, and Tanh.

公開日:2024-09-27
翻訳日:2024-11-09 15:46:48
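
For readers unfamiliar with the quantity being bounded, the empirical Rademacher complexity of a function class $F$ on a sample $x_1, \dots, x_n$ is $\mathbb{E}_\sigma[\sup_{f \in F} \frac{1}{n}\sum_i \sigma_i f(x_i)]$ with independent random signs $\sigma_i \in \{-1, +1\}$. The sketch below estimates it by Monte Carlo for a small finite class given by its output vectors. The function name and the finite-class setting are assumptions for illustration; the paper itself bounds this quantity analytically for CNNs.

```python
import random


def empirical_rademacher(outputs, n_draws=2000, seed=0):
    """Monte Carlo estimate of the empirical Rademacher complexity.

    `outputs` is a list of vectors; outputs[k][i] is f_k(x_i) for the k-th
    function of a finite class evaluated on the i-th sample point.
    """
    rng = random.Random(seed)
    n = len(outputs[0])
    total = 0.0
    for _ in range(n_draws):
        sigma = [rng.choice((-1, 1)) for _ in range(n)]
        # Supremum over the class of the sign-weighted average output.
        total += max(sum(s * v for s, v in zip(sigma, f)) for f in outputs) / n
    return total / n_draws


single = [[1.0] * 20]                 # one function: cannot correlate with noise
pair = [[1.0] * 20, [-1.0] * 20]      # a function and its negation
print(empirical_rademacher(single), empirical_rademacher(pair))
```

A richer class can fit random signs better, so its estimate is larger; this is the sense in which the complexity measures capacity.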

# 量子マルチパラメータ推定のためのギャップパーシステンス定理

The gap persistence theorem for quantum multiparameter estimation ( http://arxiv.org/abs/2208.07386v3 )

ライセンス: Link先を確認

Lorcán O. Conlon, Jun Suzuki, Ping Koy Lam, Syed M. Assad,

(参考訳) 量子距離論の1つの重要な側面は、複数のパラメータの同時推定によってのみ明らかである。対称対数微分 Cram\'er-Rao bound (SLDCRB) は、各パラメータの可換性を推定するための最適な測定値である場合、達成可能な精度を与える。最適測定が通勤しない場合、SLDCRBは必ずしも到達できない。この点において、ホレボ・クラム・ラオ境界(HCRB)は基本的役割を担い、量子状態の無限に多くのコピーを同時に測定できるとき、最終的な到達可能な精度を提供する。実用的な目的のために、長岡クラム・ラオ境界(NCRB)はより関係があり、個別に量子状態を測定することに制限される。これら3つの境界の間の相互作用は、プローブ状態の有限コピーの集合的測定によって、究極の気象学的精度がいかに早くアプローチできるかを定めている。まず2つのパラメータ推定を考慮し、HCRBがプローブ状態の1つのコピーで飽和できない場合、プローブ状態の有限個のコピーに対して飽和できないことを証明した。そこで本研究では, HCRB を物理的に動機づけたいくつかの問題に対して飽和させることは不可能であることを示す。パラメータの数を推定するためには,SLDCRBの到達可能性に必要かつ十分な条件を分離可能な測定で提供する。さらに、SLDCRBがプローブ状態の1つのコピーで到達できない場合、プローブ状態の有限個のコピーの集合的な測定では到達できないことを示す。これらの結果は、プローブ状態の有限個のコピーに対して、SLDCRBが到達可能であるために必要かつ十分な条件を提供する。これは、最近[P. Horodecki et al, Phys. Rev. X Quantum 3, 010101 (2022)] によって強調された5つの問題の1つを顕著に一般化する。

One key aspect of quantum metrology, measurement incompatibility, is evident only through the simultaneous estimation of multiple parameters. The symmetric logarithmic derivative Cramér-Rao bound (SLDCRB) gives the attainable precision if the optimal measurements for estimating each individual parameter commute. When the optimal measurements do not commute, the SLDCRB is not necessarily attainable. In this regard, the Holevo Cramér-Rao bound (HCRB) plays a fundamental role, providing the ultimate attainable precisions when one allows simultaneous measurements on infinitely many copies of a quantum state. For practical purposes, the Nagaoka Cramér-Rao bound (NCRB) is more relevant, applying when restricted to measuring quantum states individually. The interplay between these three bounds dictates how rapidly the ultimate metrological precisions can be approached through collective measurements on finite copies of the probe state. We first consider two-parameter estimation and prove that if the HCRB cannot be saturated with a single copy of the probe state, then it cannot be saturated for any finite number of copies of the probe state. With this, we show that it is impossible to saturate the HCRB for several physically motivated problems. For estimating any number of parameters, we provide necessary and sufficient conditions for the attainability of the SLDCRB with separable measurements. We further prove that if the SLDCRB cannot be reached with a single copy of the probe state, it cannot be reached with collective measurements on any finite number of copies of the probe state. These results together provide necessary and sufficient conditions for the attainability of the SLDCRB for any finite number of copies of the probe state. This solves a significant generalisation of one of the five problems recently highlighted by [P. Horodecki et al., Phys. Rev. X Quantum 3, 010101 (2022)].

公開日:2024-09-25
翻訳日:2024-11-09 15:46:48
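
The relationship between the three bounds in this abstract can be summarised as a chain of inequalities. The notation is an assumption, since the abstract introduces no symbols: $V$ denotes the mean-squared-error matrix of a locally unbiased estimator built from single-copy measurements, $W$ a positive weight matrix, and $\mathcal{C}$ the corresponding scalar bounds.

```latex
% Assumed notation: V = mean-squared-error matrix, W = positive weight matrix.
% The first inequality holds for measurements on individual copies of the state.
\operatorname{Tr}[W V]
  \;\ge\; \mathcal{C}_{\mathrm{NCRB}}
  \;\ge\; \mathcal{C}_{\mathrm{HCRB}}
  \;\ge\; \mathcal{C}_{\mathrm{SLDCRB}}
```

Read against this chain, the abstract's "gap persistence" statements say that a strict gap at the single-copy level (the HCRB, or the SLDCRB, not being attained) cannot be closed by collective measurements on any finite number of copies.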

# 数個の熱量子の分割による絡み合い成長

Entanglement growth via splitting of a few thermal quanta ( http://arxiv.org/abs/2208.07816v2 )

ライセンス: Link先を確認

Pradip Laha, Darren W. Moore, Radim Filip,

(参考訳) 量子分割は、アインシュタイン=ポドルスキー=ローゼン状態によって実証されたガウスの絡み合いの本質的な生成物であり、明らかに最も一般的に生じる絡み合いの形式である。一般に、これは高コヒーレントで低ノイズの外部駆動を持つ非線形過程の強い励起から生じる。対照的に、閉じ込められたイオンと超伝導回路における効率的な三線型過程を含む最近の実験は、数個の熱量子の分裂をテストするための相補的な可能性を開いた。このような小さな熱エネルギーによって刺激され、強い縮退したトリリニアカップリングは、蒸留可能な4次スクイージングの3dB以上で検出できる大量の非古典性を生成する。定常絡み合いは、トリリニアカップリングと平行に存在する第3モードへの頻繁なパッシブ線形カップリングによって生成される。この新しいエンタングルメントは、ガウスの近似の外にあるが、平均的な熱量子数によって驚くほど増大し、ガウスのエンタングルメントに欠落する。蒸留性スクイーズを用いて、非線形ボソニック系の新しい絡み合い機構に光を当てた。

Quanta splitting is an essential generator of Gaussian entanglement, exemplified by Einstein-Podolsky-Rosen states and apparently the most commonly occurring form of entanglement. In general, it results from the strong pumping of a nonlinear process with a highly coherent and low-noise external drive. In contrast, recent experiments involving efficient trilinear processes in trapped ions and superconducting circuits have opened the complementary possibility to test the splitting of a few thermal quanta. Stimulated by such small thermal energy, a strong degenerate trilinear coupling generates large amounts of nonclassicality, detectable by more than 3 dB of distillable quadrature squeezing. Substantial entanglement can be generated via frequent passive linear coupling to a third mode present in parallel with the trilinear coupling. This new form of entanglement, outside any Gaussian approximation, surprisingly grows with the mean number of split thermal quanta; a quality absent from Gaussian entanglement. Using distillable squeezing we shed light on this new entanglement mechanism for nonlinear bosonic systems.

公開日:2024-09-24
翻訳日:2024-11-09 15:46:48

# IDP-PGFE:物理誘導特徴抽出に基づく解釈可能な破壊予測器

IDP-PGFE: An Interpretable Disruption Predictor based on Physics-Guided Feature Extraction ( http://arxiv.org/abs/2208.13197v2 )

ライセンス: Link先を確認

Chengshuo Shen, Wei Zheng, Yonghua Ding, Xinkun Ai, Fengming Xue, Yu Zhong, Nengchao Wang, Li Gao, Zhipeng Chen, Zhoujun Yang, Zhongyong Chen, Yuan Pan, J-TEXT team,

(参考訳) ディスラプション予測は、特に機械学習(ML)ベースの手法において、近年急速に進歩している。予測器が特定の予測を行う理由を理解することは、将来のトカマク破壊予測器の予測精度と同じくらい重要である。ほとんどの破壊予測器の目的は、精度またはクロスマシン能力である。しかし、ディスラプション予測モデルが解釈可能であれば、特定のサンプルがディスラプション前駆体として分類される理由を知ることができる。これにより、入ってくる破壊のタイプを判断し、破壊のメカニズムについて洞察することが可能になる。本稿では,J-TEXT上での物理誘導特徴抽出(IDP-PGFE)に基づく解釈破壊予測器を設計する。物理誘導された特徴を抽出することにより、モデルの予測性能を効果的に向上する。解釈結果の妥当性を保証するためには,高性能モデルが必要である。 IDP-PGFEの解釈可能性の研究は、J-TEXT破壊の理解を提供し、一般に既存の破壊の理解と一致している。 IDP-PGFEは, J-TEXTにおける密度限界実験に向けて, 連続的に密度を増大させることにより, 破壊に応用されている。 PGFEの特徴の時間的進化は、ECRHの応用によって放射線による破壊が引き起こされ、破壊時の密度が低下することを示す。 RMPの適用は確かにJ-TEXTの密度限界を上昇させる。この解釈可能性の研究は、RMPがMHD不安定性だけでなく、密度限界破壊を遅らせる放射プロファイルにも影響を及ぼす密度限界破壊の物理的メカニズムの直観を導く。

Disruption prediction has made rapid progress in recent years, especially in machine learning (ML)-based methods. Understanding why a predictor makes a certain prediction can be as crucial as the prediction's accuracy for future tokamak disruption predictors. The purpose of most disruption predictors is accuracy or cross-machine capability. However, if a disruption prediction model can be interpreted, it can tell why certain samples are classified as disruption precursors. This allows us to tell the types of incoming disruption and gives us insight into the mechanism of disruption. This paper designs a disruption predictor called Interpretable Disruption Predictor based On Physics-guided feature extraction (IDP-PGFE) on J-TEXT. The prediction performance of the model is effectively improved by extracting physics-guided features. A high-performance model is required to ensure the validity of the interpretation results. The interpretability study of IDP-PGFE provides an understanding of J-TEXT disruption and is generally consistent with existing comprehension of disruption. IDP-PGFE has been applied to disruptions caused by continuously increasing the density towards the density limit in experiments on J-TEXT. The time evolution of the PGFE feature contributions demonstrates that the application of ECRH triggers radiation-caused disruption, which lowers the density at disruption, while the application of RMP indeed raises the density limit in J-TEXT. The interpretability study guides intuition on the physical mechanisms of density limit disruption: RMPs affect not only the MHD instabilities but also the radiation profile, which delays density limit disruption.

公開日:2024-09-26
翻訳日:2024-11-09 15:46:48

# Mine yOur owN Anatomy: Revisiting Medical Image Segmentation with Extremely Limited Labels

Mine yOur owN Anatomy: Revisiting Medical Image Segmentation with Extremely Limited Labels ( http://arxiv.org/abs/2209.13476v6 )

ライセンス: Link先を確認

Chenyu You, Weicheng Dai, Fenglin Liu, Yifei Min, Nicha C. Dvornek, Xiaoxiao Li, David A. Clifton, Lawrence Staib, James S. Duncan,

(参考訳) 近年のコントラスト学習の研究は, 医療画像セグメンテーションの文脈において, ラベルの少ないことのみを生かして, 優れた成果を上げている。既存の方法は、主にインスタンスの識別と不変マッピングに焦点を当てている。 1) 尾性: 医療画像データは通常、暗黙の長い尾のクラス分布に従う。したがって、訓練ですべてのピクセルを盲目的に活用することは、データの不均衡を招き、パフォーマンスを悪化させる。(2)一貫性:セグメント化モデルが、異なる解剖学的特徴のクラス内変化のために有意義で一貫性のある解剖学的特徴を学習したかどうか、(3)多様性:データセット全体のスライス内相関は、著しく低い注意を払っている。これは、データセット自体を戦略的に利用し、異なる解剖学的視点から類似しているが異なるサンプルを発見するための、原則化されたアプローチを求める動機である。本稿では,Mine yOur owN Anatomy (MONA) と呼ばれる,半教師付き2次元医用画像セグメンテーションフレームワークを紹介する。まず、先行研究では、すべてのピクセルがモデルトレーニングに等しく重要であると論じており、これらだけでは、主に監視信号が欠如していることから、意味のある解剖学的特徴を定義することは不可能である、と実証的に観察している。より強力なデータ拡張と最も近い隣人を使って、不変性を学ぶための2つの簡単なソリューションを示します。第2に,医療画像の解剖学的特徴の集合体への分解を教師なしで行うことをモデルに促す目的の集合を構築した。最後に、我々は実験的かつ理論的に、異なるラベル付き設定で3つのベンチマークデータセットに対してMONAの有効性を実証し、異なるラベル付き半教師付き設定で新しい最先端を実現する。

Recent studies on contrastive learning have achieved remarkable performance solely by leveraging few labels in the context of medical image segmentation. Existing methods mainly focus on instance discrimination and invariant mapping. However, they face three common pitfalls: (1) tailness: medical image data usually follows an implicit long-tail class distribution. Blindly leveraging all pixels in training can hence lead to data imbalance issues and degrade performance; (2) consistency: it remains unclear whether a segmentation model has learned meaningful and yet consistent anatomical features due to the intra-class variations between different anatomical features; and (3) diversity: the intra-slice correlations within the entire dataset have received significantly less attention. This motivates us to seek a principled approach for strategically making use of the dataset itself to discover similar yet distinct samples from different anatomical views. In this paper, we introduce a novel semi-supervised 2D medical image segmentation framework termed Mine yOur owN Anatomy (MONA), and make three contributions. First, prior work argues that every pixel equally matters to the model training; we observe empirically that this alone is unlikely to define meaningful anatomical features, mainly because the supervision signal is lacking. We show two simple solutions towards learning invariances: the use of stronger data augmentations and of nearest neighbors. Second, we construct a set of objectives that encourage the model to be capable of decomposing medical images into a collection of anatomical features in an unsupervised manner. Lastly, we demonstrate, both empirically and theoretically, the efficacy of MONA on three benchmark datasets with different labeled settings, achieving new state-of-the-art under different labeled semi-supervised settings.

公開日:2024-09-22
翻訳日:2024-11-09 15:46:48

# FIRE:エッジコンピューティングマイグレーションのための障害適応型強化学習フレームワーク

FIRE: A Failure-Adaptive Reinforcement Learning Framework for Edge Computing Migrations ( http://arxiv.org/abs/2209.14399v3 )

ライセンス: Link先を確認

Marie Siew, Shikhar Sharma, Zekai Li, Kun Guo, Chao Xu, Tania Lorido-Botran, Tony Q. S. Quek, Carlee Joe-Wong,

(参考訳) エッジコンピューティングでは、ユーザのモビリティのために、ユーザのサービスプロファイルが移行される。強化学習(RL)フレームワークは、しばしばシミュレーションデータに基づいて訓練される。しかし、既存のRLフレームワークは時折サーバの障害を見落としており、これは、自律運転やリアルタイム障害検出のような遅延に敏感なアプリケーションに影響を与えている。それでも、過去のトレーニングデータで適切に表現されていないこれらの失敗(まれな出来事)は、データ駆動RLアルゴリズムに挑戦する。実世界のトレーニング用アプリケーションにおいて、故障頻度を調整することは現実的ではないため、エッジコンピューティングのディジタルツイン環境でRLポリシーをトレーニングすることで、まれな事象に適応するフレームワークであるFIREを導入する。 ImREは重要なサンプリングに基づくQ-ラーニングアルゴリズムであり、希少事象をその値関数への影響に比例してサンプリングする。 FIREは、個々のサービスプロファイルと共有サービスのプロファイルにまたがる遅延、マイグレーション、障害、バックアップの配置コストを考慮に入れている。我々はImREの有界性と最適性への収束性を証明する。次に、拡張性を高めるために、新しいQ-ラーニング(ImDQL)とアクタ評論家(ImACRE)バージョンを導入します。リスクトレランスの異なるユーザに対応するために、当社のフレームワークを拡張しています。トレース駆動実験により,障害発生時のバニラRLやグリーディベースラインと比較して,FIREがコストを削減できることが判明した。

In edge computing, users' service profiles are migrated due to user mobility. Reinforcement learning (RL) frameworks have been proposed to do so, often trained on simulated data. However, existing RL frameworks overlook occasional server failures, which, although rare, impact latency-sensitive applications like autonomous driving and real-time obstacle detection. Nevertheless, these failures (rare events), not being adequately represented in historical training data, pose a challenge for data-driven RL algorithms. As it is impractical to adjust failure frequency in real-world applications for training, we introduce FIRE, a framework that adapts to rare events by training an RL policy in an edge computing digital twin environment. We propose ImRE, an importance sampling-based Q-learning algorithm, which samples rare events proportionally to their impact on the value function. FIRE considers delay, migration, failure, and backup placement costs across individual and shared service profiles. We prove ImRE's boundedness and convergence to optimality. Next, we introduce novel deep Q-learning (ImDQL) and actor-critic (ImACRE) versions of our algorithm to enhance scalability. We extend our framework to accommodate users with varying risk tolerances. Through trace-driven experiments, we show that FIRE reduces costs compared to vanilla RL and the greedy baseline in the event of failures.

公開日:2024-09-22
翻訳日:2024-11-09 15:35:37
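
The reason importance sampling helps with rare failures can be shown with a short sketch. It is a generic importance-sampling estimator, not the ImRE algorithm: the function name, the failure probabilities, and the costs are hypothetical. Failures are drawn far more often than they truly occur, and each sample is reweighted by the ratio of true to proposal probability so that the estimate of the expected cost stays unbiased.

```python
import random


def is_expected_cost(p_fail, q_fail, cost_fail, cost_ok, n, seed=0):
    """Estimate E[cost] under true failure rate p_fail while sampling at rate q_fail."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        if rng.random() < q_fail:
            # Failure sampled too often: down-weight by p / q.
            total += (p_fail / q_fail) * cost_fail
        else:
            # Normal operation sampled too rarely: up-weight by (1 - p) / (1 - q).
            total += ((1.0 - p_fail) / (1.0 - q_fail)) * cost_ok
    return total / n


# True failure rate 0.1%, but failures are simulated 20% of the time.
estimate = is_expected_cost(p_fail=0.001, q_fail=0.2,
                            cost_fail=1000.0, cost_ok=1.0, n=20_000)
exact = 0.001 * 1000.0 + 0.999 * 1.0
print(f"importance-sampling estimate: {estimate:.3f}, exact: {exact:.3f}")
```

In an RL setting the same reweighting would be applied to value-function updates, so that a learner sees enough failures to learn from them without overestimating how often they happen.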

# How Does Return Distribution in Distributional Reinforcement Learning Help Optimization?

How Does Return Distribution in Distributional Reinforcement Learning Help Optimization? ( http://arxiv.org/abs/2209.14513v2 )

ライセンス: Link先を確認

Ke Sun, Bei Jiang, Linglong Kong,

(参考訳) 分散強化学習は、標準RLでの期待だけでなく、戻り分布全体を学習することに焦点を当てており、性能向上に顕著な成功を収めている。これらの進歩にもかかわらず、分布RL内の戻り分布の理解は依然として限られている。本研究では、ニューラルネットワークZ-Iteration~(Neural FZI)フレームワークにおいて、古典的RLにまたがる再帰分布知識を利用して、分布RLの最適化の利点を検討する。まず, 分布RLの分布損失は, 良好な滑らかさ特性を持ち, 最適化安定性を促進する傾向にある安定勾配を享受できることを実証する。さらに、戻り分布を分解することにより、分布RLの加速効果を明らかにする。分布RLは、各環境における勾配推定のばらつきによって、戻り分布近似が適切であれば好適に動作することを示す。厳密な実験は、分布RLの安定な最適化挙動とその加速効果を古典的RLと比較して検証する。本研究は,分布RLアルゴリズムの帰属分布が最適化にどう役立つかを明らかにする。

Distributional reinforcement learning, which focuses on learning the entire return distribution instead of only its expectation as in standard RL, has demonstrated remarkable success in enhancing performance. Despite these advancements, our understanding of the role of the return distribution within distributional RL remains limited. In this study, we investigate the optimization advantages of distributional RL by utilizing its extra return distribution knowledge over classical RL within the Neural Fitted Z-Iteration (Neural FZI) framework. To begin with, we demonstrate that the distribution loss of distributional RL has desirable smoothness characteristics and hence enjoys stable gradients, which is in line with its tendency to promote optimization stability. Furthermore, the acceleration effect of distributional RL is revealed by decomposing the return distribution. It shows that distributional RL can perform favorably if the return distribution approximation is appropriate, measured by the variance of gradient estimates in each environment. Rigorous experiments validate the stable optimization behaviors of distributional RL and its acceleration effects compared to classical RL. Our research findings illuminate how the return distribution in distributional RL algorithms helps the optimization.

公開日:2024-09-23
翻訳日:2024-11-09 15:35:37

# DICTDIS:改良NMTのための曖昧さを制限した辞書

DICTDIS: Dictionary Constrained Disambiguation for Improved NMT ( http://arxiv.org/abs/2210.06996v3 )

ライセンス: Link先を確認

Ayush Maheshwari, Preethi Jyothi, Ganesh Ramakrishnan,

(参考訳) ドメイン固有ニューラルマシン翻訳(NMT)システムは、多言語社会における多様なユーザ集合に情報をアクセスできるようにする可能性において、社会的に重要な存在である。このようなNMTシステムは、語彙的に制約され、ドメイン固有の辞書から引き出されることが望ましい。辞書は、単語の多文性のために、ソースワード/フレーズに対して複数の候補翻訳を提示することができる。次に、オンスはNMTモデル上で、文脈的に最も適切な候補を選択する。以前の作業ではこの問題をほとんど無視しており、ターゲット語やフレーズを単一の制約に置き換える単一の制約設定に重点を置いていた。本研究では辞書から派生した複数の候補翻訳の曖昧さを解消する語彙制約付きNMTシステムであるDictDisを提案する。我々は、複数の辞書候補とのトレーニングデータを増強し、複数の候補制約を暗黙的に調整することで、トレーニング中の曖昧さを積極的に促進する。我々は、規制、金融、工学を含む様々な分野において、英語・ヒンディー語・英語・ドイツ語文に関する広範な実験を通じて、DictDisの有用性を実証する。また、標準ベンチマークテストデータセットの比較も行う。語彙的に制約された非拘束NMTに対する既存のアプローチと比較して、制限されたコピーや曖昧さに関連するすべての領域に対する優れた性能を示し、また、いくつかの領域において最大2-3 BLEU点の周波数改善を得る。

Domain-specific neural machine translation (NMT) systems (e.g., in educational applications) are socially significant with the potential to help make information accessible to a diverse set of users in multilingual societies. It is desirable that such NMT systems be lexically constrained and draw from domain-specific dictionaries. Dictionaries could present multiple candidate translations for a source word/phrase due to the polysemous nature of words. The onus is then on the NMT model to choose the contextually most appropriate candidate. Prior work has largely ignored this problem and focused on the single candidate constraint setting wherein the target word or phrase is replaced by a single constraint. In this work, we present DictDis, a lexically constrained NMT system that disambiguates between multiple candidate translations derived from dictionaries. We achieve this by augmenting training data with multiple dictionary candidates to actively encourage disambiguation during training by implicitly aligning multiple candidate constraints. We demonstrate the utility of DictDis via extensive experiments on English-Hindi and English-German sentences in a variety of domains including regulatory, finance, and engineering. We also present comparisons on standard benchmark test datasets. In comparison with existing approaches for lexically constrained and unconstrained NMT, we demonstrate superior performance with respect to constraint copy and disambiguation related measures on all domains while also obtaining improved fluency of up to 2-3 BLEU points on some domains.

公開日:2024-09-27
翻訳日:2024-11-09 15:35:37

# 航空機エンジンブレードの知的欠陥検出のための超画素知覚グラフニューラルネットワーク

Superpixel perception graph neural network for intelligent defect detection of aero-engine blade ( http://arxiv.org/abs/2210.07539v2 )

ライセンス: Link先を確認

Hongbing Shang, Qixiu Yang, Chuang Sun, Xuefeng Chen, Ruqiang Yan,

(参考訳) エアロエンジンは航空機や他の宇宙船のコアコンポーネントである。高速回転翼は空気を吸って完全に燃焼することで力を提供し、様々な欠陥が必然的に発生し、航空エンジンの運転安全性を脅かす。そのため、このような複雑なシステムには定期的な検査が不可欠である。しかしながら、ボアスコープ検査を行う既存の技術は、労働集約的で、時間がかかり、経験に依存している。特徴抽出のための多段階グラフ畳み込みネットワーク(MSGCN)と領域提案のための超画素知覚領域提案ネットワーク(SPRPN)を用いて,この技術を知能で実現するために,新しい超画素知覚グラフニューラルネットワーク(SPGNN)を提案する。まず、複雑な不規則なテクスチャをキャプチャするために、画像は一連のパッチに変換され、グラフ表現を得る。次に、複数のGCNブロックからなるMSGCNがグラフ構造の特徴を抽出し、グラフレベルでグラフ情報処理を行う。最後に、SPRPNは、グラフ表現特徴とスーパーピクセル知覚特徴を融合させて知覚境界ボックスを生成する。そのため,提案したSPGNNは,SPGNNパイプライン全体のグラフレベルにおいて,常に特徴抽出と情報伝達を実装し,受容野の減少と情報損失を軽減する。 SPGNNの有効性を検証するため,3000枚の画像を用いたシミュレートされたブレードデータセットを構築した。アルミニウムのパブリックデータセットは、異なる方法のパフォーマンスを検証するためにも使われる。実験結果から,提案したSPGNNは最先端手法と比較して優れた性能を示した。

The aero-engine is the core component of aircraft and other spacecraft. The high-speed rotating blades provide power by drawing in air for full combustion, and various defects will inevitably occur, threatening the operational safety of the aero-engine. Therefore, regular inspections are essential for such a complex system. However, the existing traditional technique, borescope inspection, is labor-intensive, time-consuming, and experience-dependent. To endow this technology with intelligence, a novel superpixel perception graph neural network (SPGNN) is proposed by utilizing a multi-stage graph convolutional network (MSGCN) for feature extraction and a superpixel perception region proposal network (SPRPN) for region proposal. First, to capture complex and irregular textures, the images are transformed into a series of patches to obtain their graph representations. Then, the MSGCN, composed of several GCN blocks, extracts graph structure features and performs graph information processing at the graph level. Finally, the SPRPN is proposed to generate perceptual bounding boxes by fusing graph representation features and superpixel perception features. In this way, the proposed SPGNN performs feature extraction and information transmission at the graph level throughout the whole pipeline, alleviating the reduction of the receptive field and information loss. To verify the effectiveness of SPGNN, we construct a simulated blade dataset with 3000 images. A public aluminum dataset is also used to validate the performance of different methods. The experimental results demonstrate that the proposed SPGNN has superior performance compared with the state-of-the-art methods.

公開日:2024-09-22
翻訳日:2024-11-09 15:35:37
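
The first step described in this abstract, turning an image into patches and then into a graph, can be sketched as follows. The abstract does not say how edges are defined, so the 4-neighbour grid adjacency used here is an assumption, as are the function name and the patch size.

```python
def patch_grid_graph(height, width, patch):
    """Split an image of the given size into square patches and connect
    each patch to its right and lower neighbour (assumed grid adjacency)."""
    rows, cols = height // patch, width // patch
    nodes = [(r, c) for r in range(rows) for c in range(cols)]
    edges = []
    for r, c in nodes:
        if c + 1 < cols:
            edges.append(((r, c), (r, c + 1)))   # horizontal neighbour
        if r + 1 < rows:
            edges.append(((r, c), (r + 1, c)))   # vertical neighbour
    return nodes, edges


nodes, edges = patch_grid_graph(height=4, width=6, patch=2)
print(len(nodes), "patch nodes,", len(edges), "undirected edges")
```

In a graph convolutional network each node would additionally carry a feature vector computed from the pixels of its patch; only the connectivity is shown here.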

# ベイジアンニューラルネットワークのためのデータサブサンプリング

Data Subsampling for Bayesian Neural Networks ( http://arxiv.org/abs/2210.09141v2 )

ライセンス: Link先を確認

Eiji Kawasaki, Markus Holzmann, Lawrence Adu-Gyamfi,

(参考訳) Markov Chain Monte Carlo (MCMC)アルゴリズムは、ニューラルネットワークの後方サンプリングの困難に繋がる大規模なデータセットに対して、うまくスケールしない。本稿では,ベイジアン推論コンテキストにおけるバッチデータ(ミニバッチ)を用いて拡張性に対処する可能性を評価するアルゴリズムとして,Penalty Bayesian Neural Networks - PBNNを提案する。 PBNNは、メトロポリス・ヘイスティングス・アルゴリズムの一般化の一環としてペナルティ項を組み込むことによって、他のナイーブ・サブサンプリング技術に固有のバイアスを回避する。既存のMCMCフレームワークとPBNNを統合することは容易であり、損失関数の分散は単に受け入れ確率を減少させるだけである。合成データとMNISTデータセットの代替サンプリング戦略を比較することで、PBNNは小さなミニバッチサイズであっても優れた予測性能が得られることを示した。 PBNNは,ミニバッチサイズの変化による予測分布のキャリブレーションを行い,予測過信を著しく低減する手法を提案する。

Markov Chain Monte Carlo (MCMC) algorithms do not scale well for large datasets leading to difficulties in Neural Network posterior sampling. In this paper, we propose Penalty Bayesian Neural Networks - PBNNs, as a new algorithm that allows the evaluation of the likelihood using subsampled batch data (mini-batches) in a Bayesian inference context towards addressing scalability. PBNN avoids the biases inherent in other naive subsampling techniques by incorporating a penalty term as part of a generalization of the Metropolis Hastings algorithm. We show that it is straightforward to integrate PBNN with existing MCMC frameworks, as the variance of the loss function merely reduces the acceptance probability. By comparing with alternative sampling strategies on both synthetic data and the MNIST dataset, we demonstrate that PBNN achieves good predictive performance even for small mini-batch sizes of data. We show that PBNN provides a novel approach for calibrating the predictive distribution by varying the mini-batch size, significantly reducing predictive overconfidence.
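The penalty idea in the PBNN abstract can be illustrated with a minimal sketch. It assumes the correction takes the classic Ceperley-Dewing penalty form, in which a noisy estimate of the log-posterior difference with variance sigma^2 is accepted with probability min(1, exp(delta_hat - sigma^2 / 2)). The function names, the unit-variance Gaussian toy model, and the way the variance is estimated from the mini-batch are illustrative assumptions, not the authors' implementation.

```python
import math
import random


def penalty_accept_prob(delta_hat, sigma2):
    """Acceptance probability for a noisy log-posterior difference.

    delta_hat: mini-batch estimate of log p(new) - log p(old)
    sigma2:    estimated variance of delta_hat (the penalty grows with it)
    """
    return min(1.0, math.exp(delta_hat - 0.5 * sigma2))


def minibatch_delta(data, theta_old, theta_new, batch_size, rng):
    """Estimate the full-data log-likelihood difference from one mini-batch.

    Toy model (an assumption): each x is drawn from N(theta, 1), so the
    per-point log-likelihood is -0.5 * (x - theta)^2 up to a constant.
    Returns the estimate and the estimated variance of that estimate.
    """
    batch = rng.sample(data, batch_size)
    n = len(data)
    diffs = [0.5 * (x - theta_old) ** 2 - 0.5 * (x - theta_new) ** 2 for x in batch]
    mean = sum(diffs) / batch_size
    var = sum((d - mean) ** 2 for d in diffs) / (batch_size - 1)
    delta_hat = n * mean                 # scale the batch mean up to the full data set
    sigma2 = n ** 2 * var / batch_size   # variance of n * (batch mean)
    return delta_hat, sigma2
```

With zero variance the rule reduces to the ordinary Metropolis-Hastings acceptance probability; a noisier estimate (for example from a smaller mini-batch) only lowers the probability of accepting a move, which matches the abstract's remark that the variance merely reduces the acceptance probability.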

公開日:2024-09-23
翻訳日:2024-11-09 15:35:37

# 等変拡散モデルを用いた構造に基づく医薬品設計

Structure-based Drug Design with Equivariant Diffusion Models ( http://arxiv.org/abs/2210.13695v3 )

ライセンス: Link先を確認

Arne Schneuing, Charles Harris, Yuanqi Du, Kieran Didi, Arian Jamasb, Ilia Igashov, Weitao Du, Carla Gomes, Tom Blundell, Pietro Lio, Max Welling, Michael Bronstein, Bruno Correia,

(参考訳) SBDD(Structure-based drug design)は、あらかじめ定められたタンパク質標的に高い親和性と特異性で結合する小分子リガンドを設計することを目的としている。生成的SBDD法は、タンパク質標的と複合体を形成した薬物の構造データを利用して、新しい薬物候補を提案する。これらのアプローチは通常、結合ポケットとそれまでに追加されたリガンド原子を各ステップのコンテキストとして用い、原子を1つずつ自己回帰的に配置する。近年、拡散生成モデルが相次いでこの領域に導入され、天然リガンドの統計的性質をより忠実に捉えることが期待されている。しかしながら、既存のほとんどの手法は、化合物のボトムアップなde novo設計のみに焦点を当てるか、タスク固有のモデルで他の創薬課題に取り組んでいる。後者は適切なデータセットのキュレーション、モデルの慎重なエンジニアリング、タスクごとのスクラッチからの再学習を必要とする。ここでは,既製モデルをそのまま用いた特性最適化,明示的なネガティブデザイン,インペインティングによる部分分子設計など,より広範な問題に対して,単一の事前学習済み拡散モデルを適用する方法を示す。本稿では,SBDDを3次元条件付き生成問題として定式化し,タンパク質ポケットを条件として新規リガンドを生成するSE(3)同変拡散モデルDiffSBDDを提案する。in silico実験では、DiffSBDDが正解データの統計を効果的に捉えていることが示された。さらに、様々な計算指標に照らして、生成した薬物候補を改善するために、追加の制約をどのように利用できるかを示す。これらの結果は, 拡散モデルが従来の手法よりも正確に構造データの複雑な分布を表現し, サンプリング戦略を変更するだけで追加の設計目標や制約を組み込むことができるという仮定を支持している。

Structure-based drug design (SBDD) aims to design small-molecule ligands that bind with high affinity and specificity to pre-determined protein targets. Generative SBDD methods leverage structural data of drugs in complex with their protein targets to propose new drug candidates. These approaches typically place one atom at a time in an autoregressive fashion using the binding pocket as well as previously added ligand atoms as context in each step. Recently a surge of diffusion generative models has entered this domain which hold promise to capture the statistical properties of natural ligands more faithfully. However, most existing methods focus exclusively on bottom-up de novo design of compounds or tackle other drug development challenges with task-specific models. The latter requires curation of suitable datasets, careful engineering of the models and retraining from scratch for each task. Here we show how a single pre-trained diffusion model can be applied to a broader range of problems, such as off-the-shelf property optimization, explicit negative design, and partial molecular design with inpainting. We formulate SBDD as a 3D-conditional generation problem and present DiffSBDD, an SE(3)-equivariant diffusion model that generates novel ligands conditioned on protein pockets. Our in silico experiments demonstrate that DiffSBDD captures the statistics of the ground truth data effectively. Furthermore, we show how additional constraints can be used to improve the generated drug candidates according to a variety of computational metrics. These results support the assumption that diffusion models represent the complex distribution of structural data more accurately than previous methods, and are able to incorporate additional design objectives and constraints changing nothing but the sampling strategy.

公開日:2024-09-23
翻訳日:2024-11-09 15:35:37

# ソフトラベルプロトタイプを用いて少数の事例から新しいタスクを学習する

Learning New Tasks from a Few Examples with Soft-Label Prototypes ( http://arxiv.org/abs/2210.17437v4 )

ライセンス: Link先を確認

Avyav Kumar Singh, Ekaterina Shutova, Helen Yannakoudakis,

(参考訳) NLPにおける少数ショット学習への既存のアプローチは、分布外データに一般化するために、大規模言語モデル(LLM)やその微調整に頼っている。本研究では,入力領域空間全体にわたる異なるクラスの分布を集合的に捉えるよう設計されたソフトラベルプロトタイプ(SLP)に基づく,新しい少数ショット学習手法を提案する。本稿では,未知のNLPタスクをクラスごとのごく少数の例 (4, 8, 16) から学習することに焦点を当て,このデータが乏しい設定において,本手法がパラメータ効率に優れながら,テストしたタスクの大部分で優れた性能を達成することを実験的に実証する。また,本手法の少数ショット適応法は,より一般的な学習設定,主にメタラーニングに組み込むことができ,強力なベースラインに対して優れた性能が得られることを示す。

Existing approaches to few-shot learning in NLP rely on large language models (LLMs) and/or fine-tuning of these to generalise on out-of-distribution data. In this work, we propose a novel few-shot learning approach based on soft-label prototypes (SLPs) designed to collectively capture the distribution of different classes across the input domain space. We focus on learning previously unseen NLP tasks from very few examples (4, 8, 16) per class and experimentally demonstrate that our approach achieves superior performance on the majority of tested tasks in this data-lean setting while being highly parameter efficient. We also show that our few-shot adaptation method can be integrated into more generalised learning settings, primarily meta-learning, to yield superior performance against strong baselines.

公開日:2024-09-22
翻訳日:2024-11-09 15:35:37

# 複素逆温度平面における量子臨界性のシグナチャ

Signatures of quantum criticality in the complex inverse temperature plane ( http://arxiv.org/abs/2211.00813v2 )

ライセンス: Link先を確認

Yang Liu, Songtai Lv, Yang Yang, Haiyuan Zou,

(参考訳) 複素分配関数とフィッシャー零点の概念は、有限温度および実時間の動的相転移に対する本質的な統計力学的機構を提供する。我々はこれらの複素化の有用性を量子相転移に拡張する。1次元横磁場イジング模型に対して、直線上あるいは閉曲線上の異なるフィッシャー零点を厳密に同定し、ドメイン壁励起や閉じ込められたメソンとの対応を明らかにする。フィッシャー零点のクロスオーバー挙動は、量子相転移近傍の臨界性に対する興味深い描像を与え、そこでは励起エネルギースケールが定量的に決定される。さらに、テンソルネットワーク計算によって結果を確認し、閉じた零点曲線の崩壊から、非閉じ込めメソン励起の明確な信号を示す。我々の結果は、量子相転移に対するフィッシャー零点の重要な特徴を明確に示し、量子臨界性を探る新しい道を開く。

Concepts of the complex partition functions and the Fisher zeros provide intrinsic statistical mechanisms for finite temperature and real time dynamical phase transitions. We extend the utility of these complexifications to quantum phase transitions. We exactly identify different Fisher zeros on lines or closed curves and elucidate their correspondence with domain-wall excitations or confined mesons for the one-dimensional transverse field Ising model. The crossover behavior of the Fisher zeros provides a fascinating picture for criticality near the quantum phase transition, where the excitation energy scales are quantitatively determined. We further confirm our results by tensor network calculations and demonstrate a clear signal of deconfined meson excitations from the disruption of the closed zero curves. Our results unambiguously show significant features of Fisher zeros for a quantum phase transition and open up a new route to explore quantum criticality.
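As a reminder of what a Fisher zero is, the following worked example uses a single two-level system with energies 0 and $\Delta$. This toy system is an assumption chosen for illustration and is much simpler than the transverse field Ising model studied in the paper. Its partition function vanishes only at purely imaginary inverse temperatures, so the zeros fall on a line in the complex $\beta$ plane.

```latex
Z(\beta) = \operatorname{Tr}\, e^{-\beta H} = 1 + e^{-\beta \Delta},
\qquad
Z(\beta) = 0
\iff e^{-\beta \Delta} = -1
\iff \beta_k = \frac{i\,(2k+1)\,\pi}{\Delta}, \quad k \in \mathbb{Z}.
```

The spacing of the zeros along the imaginary axis is $2\pi/\Delta$, so in this toy case the excitation energy $\Delta$ can be read off directly from the location of the zeros.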

公開日:2024-09-24
翻訳日:2024-11-09 15:35:37

# Solidago: モジュール型のコラボレーションスコーリングパイプライン

Solidago: A Modular Collaborative Scoring Pipeline ( http://arxiv.org/abs/2211.01179v3 )

ライセンス: Link先を確認

Lê Nguyên Hoang, Romain Beylerian, Bérangère Colbois, Julien Fageot, Louis Faucon, Aidan Jungo, Alain Le Noac'h, Adrien Matissart, Oscar Villemaud,

(参考訳) 本稿では,任意のユーザコミュニティが任意の数のエンティティを共同でスコアリングできるようにする,エンドツーエンドのモジュール型パイプラインであるSolidagoを提案する。Solidagoは6つのモジュールへの分解を提案している。第1に、プリトラストとピアツーピアの保証(vouch)を使用して、信頼スコアをユーザに割り当てる。第2に、参加状況に基づいて、信頼スコアをユーザごと・エンティティごとの投票権に変換する。第3に、各ユーザについて、そのユーザの評価データから選好モデルを学習する。第4に、ユーザのモデルを同様のスケールに揃える。第5に、これらのモデルを安全に集約する。第6に、モデルを後処理して、人間が読めるグローバルスコアを得る。また、新しい信頼伝播アルゴリズムや、最先端のスケーリングおよび集約ソリューションの適応を含む、6つのモジュールのデフォルト実装も提案する。我々のパイプラインはオープンソースプラットフォームであるtournesol.appにデプロイされている。これにより、あらゆるエンティティ集合の協調的、効果的、スケーラブル、公正、解釈可能かつ安全なスコアリングのための魅力的な基盤を築く。

This paper presents Solidago, an end-to-end modular pipeline to allow any community of users to collaboratively score any number of entities. Solidago proposes a six-module decomposition. First, it uses pretrust and peer-to-peer vouches to assign trust scores to users. Second, based on participation, trust scores are turned into voting rights per user per entity. Third, for each user, a preference model is learned from the user's evaluation data. Fourth, users' models are put on a similar scale. Fifth, these models are securely aggregated. Sixth, models are post-processed to yield human-readable global scores. We also propose default implementations of the six modules, including a novel trust propagation algorithm, and adaptations of state-of-the-art scaling and aggregation solutions. Our pipeline has been successfully deployed on the open-source platform tournesol.app. We thereby lay an appealing foundation for the collaborative, effective, scalable, fair, interpretable and secure scoring of any set of entities.

公開日:2024-09-25
翻訳日:2024-11-09 15:35:37

# 解釈可能な機械学習を用いたIctal-Interictal-Injury Continuumにおける脳波パターン分類における臨床医の成績改善

Improving Clinician Performance in Classification of EEG Patterns on the Ictal-Interictal-Injury Continuum using Interpretable Machine Learning ( http://arxiv.org/abs/2211.05207v5 )

ライセンス: Link先を確認

Alina Jade Barnett, Zhicheng Guo, Jin Jing, Wendong Ge, Peter W. Kaplan, Wan Yee Kong, Ioannis Karakis, Aline Herlopian, Lakshman Arcot Jayagopal, Olga Taraschenko, Olga Selioutski, Gamaleldin Osman, Daniel Goldenholz, Cynthia Rudin, M. Brandon Westover,

(参考訳) 集中治療室(ICU)では、重篤な脳損傷を防ぐために、重症患者は脳波(EEG)で監視される。監視できる患者の数は、脳波を読める訓練を受けた医師の数によって制限され、脳波の解釈は主観的になりやすく、観察者間のばらつきが生じやすい。脳波のための自動ディープラーニングシステムは、人間のバイアスを減らし、診断プロセスを加速できる可能性がある。しかし、ブラックボックスのディープラーニングモデルは信頼しにくく、トラブルシューティングが難しく、現実のアプリケーションでは説明責任を欠くため、臨床医による信頼と採用が進んでいない。これらの課題に対処するために、有害な脳波パターンの存在を予測するだけでなく、その決定に関する高品質なケースベースの説明を提供する、解釈可能な新しいディープラーニングモデルを提案する。我々のモデルは、解釈可能であるよう制約されているにもかかわらず、対応するブラックボックスモデルよりも優れた性能を発揮する。学習した2次元埋め込み空間は、ictal-interictal-injury continuumの脳波パターンの構造に関する最初の大域的概観を提供する。我々のモデルがどのように決定に達したかを理解できることは、臨床医が有害な脳活動をより正確に診断・治療するのに役立つだけでなく、臨床実践における機械学習モデルへの信頼と採用を高める。これはICU神経科医の標準ワークフローの不可欠な構成要素となりうる。

In intensive care units (ICUs), critically ill patients are monitored with electroencephalograms (EEGs) to prevent serious brain injury. The number of patients who can be monitored is constrained by the availability of trained physicians to read EEGs, and EEG interpretation can be subjective and prone to inter-observer variability. Automated deep learning systems for EEG could reduce human bias and accelerate the diagnostic process. However, black box deep learning models are untrustworthy, difficult to troubleshoot, and lack accountability in real-world applications, leading to a lack of trust and adoption by clinicians. To address these challenges, we propose a novel interpretable deep learning model that not only predicts the presence of harmful brainwave patterns but also provides high-quality case-based explanations of its decisions. Our model performs better than the corresponding black box model, despite being constrained to be interpretable. The learned 2D embedded space provides the first global overview of the structure of ictal-interictal-injury continuum brainwave patterns. The ability to understand how our model arrived at its decisions will not only help clinicians to diagnose and treat harmful brain activities more accurately but also increase their trust and adoption of machine learning models in clinical practice; this could be an integral component of the ICU neurologists' standard workflow.

公開日:2024-09-25
翻訳日:2024-11-09 15:35:37

# スパースディープニューラルネットワークアーキテクチャのための適応的・安定的階層的学習手法

An Adaptive and Stability-Promoting Layerwise Training Approach for Sparse Deep Neural Network Architecture ( http://arxiv.org/abs/2211.06860v2 )

ライセンス: Link先を確認

C G Krishnanunni, Tan Bui-Thanh,

(参考訳) この研究は、与えられたトレーニングデータセットに対してうまく一般化するディープニューラルネットワーク(DNN)アーキテクチャを段階的に構築するための2段階適応フレームワークを提案する。第1段階では、新しい層を1つずつ追加し、それ以前の層のパラメータを凍結して独立に訓練する、層ごとの訓練アプローチを採用する。我々は、多様体正則化、スパース性正則化、物理インフォームド項を用いることで、DNNに望ましい構造を課す。本稿では, 学習アルゴリズムの望ましい特性として, イプシロン・デルタ安定性促進の概念を導入し, 多様体正則化を用いることで, イプシロン・デルタ安定性促進アルゴリズムが得られることを示す。さらに,新たに追加した層が訓練可能であるための必要条件を導出し,訓練の飽和問題について検討する。アルゴリズムの第2段階(後処理)では、浅いネットワークの列を用いて、第1段階で生じた残差から情報を抽出し、予測精度を向上させる。プロトタイプ的な回帰問題と分類問題に関する数値実験により,提案手法が同一サイズの全結合DNNを上回りうることを示す。さらに、偏微分方程式を解くために物理インフォームドニューラルネットワーク(PINN)に提案する適応型アーキテクチャ戦略を組み込むことにより、適応型PINNは標準のPINNより優れているだけでなく、証明可能な安定性を持つ解釈可能な隠れ層を生成することを数値的に示す。また, 楕円型偏微分方程式に支配される逆問題の解法にも, アーキテクチャ設計戦略を適用した。

This work presents a two-stage adaptive framework for progressively developing deep neural network (DNN) architectures that generalize well for a given training data set. In the first stage, a layerwise training approach is adopted where a new layer is added each time and trained independently by freezing parameters in the previous layers. We impose desirable structures on the DNN by employing manifold regularization, sparsity regularization, and physics-informed terms. We introduce an epsilon-delta stability-promoting concept as a desirable property for a learning algorithm and show that employing manifold regularization yields an epsilon-delta stability-promoting algorithm. Further, we derive the necessary conditions for the trainability of a newly added layer and investigate the training saturation problem. In the second stage of the algorithm (post-processing), a sequence of shallow networks is employed to extract information from the residual produced in the first stage, thereby improving the prediction accuracy. Numerical investigations on prototype regression and classification problems demonstrate that the proposed approach can outperform fully connected DNNs of the same size. Moreover, by equipping the physics-informed neural network (PINN) with the proposed adaptive architecture strategy to solve partial differential equations, we numerically show that adaptive PINNs not only are superior to standard PINNs but also produce interpretable hidden layers with provable stability. We also apply our architecture design strategy to solve inverse problems governed by elliptic partial differential equations.

公開日:2024-09-22
翻訳日:2024-11-09 15:35:37

# 量子コンピュータにおける振動構造の測定回数の最適化:座標と測定方法

Optimizing the number of measurements for vibrational structure on quantum computers: coordinates and measurement schemes ( http://arxiv.org/abs/2211.11615v2 )

ライセンス: Link先を確認

Marco Majland, Rasmus Berg Jensen, Mads Greisen Højlund, Nikolaj Thomas Zinner, Ove Christiansen,

(参考訳) 近い将来のデバイスで実用的な量子優位性を実証することを妨げている主な課題の1つは、基底状態エネルギーなどの関連する物理量の推定に過剰な測定オーバーヘッドがかかることである。しかし、分子の電子構造と振動構造には大きな違いがあるため、非調和振動状態の計算に必要なリソースをいかに減らすかという問題は、電子構造の場合と比べて比較的未解明のままである。重要なことに、ボゾンの交換関係、区別可能なヒルベルト空間、振動座標は、リソース要求を最小化するために活用できる振動系の操作を可能にする。本研究では, 種々の3モード(6モード)分子の非調和振動状態の推定に必要な測定回数に対する, 異なる座標系と測定方式の影響について検討する。従来の振動構造プログラムからの量子ビットハミルトニアンの自動構成に基づき, 適切な座標変換を用いることで, 必要な測定回数が平均で3分の1(1.5分の1), 最大で7分の1(2.5分の1)に削減されることを示す。

One of the primary challenges prohibiting demonstrations of practical quantum advantages for near-term devices amounts to excessive measurement overheads for estimating relevant physical quantities such as ground state energies. However, with major differences between the electronic and vibrational structure of molecules, the question of how the resource requirements of computing anharmonic, vibrational states can be reduced remains relatively unexplored compared to its electronic counterpart. Importantly, bosonic commutation relations, distinguishable Hilbert spaces and vibrational coordinates allow manipulations of the vibrational system that can be exploited to minimize resource requirements. In this work, we investigate the impact of different coordinate systems and measurement schemes on the number of measurements needed to estimate anharmonic, vibrational states for a variety of three-mode (six-mode) molecules. We demonstrate an average of 3-fold (1.5-fold), with up to 7-fold (2.5-fold), reduction in the number of measurements required by employing appropriate coordinate transformations, based on an automated construction of qubit Hamiltonians from a conventional vibrational structure program.

公開日:2024-09-24
翻訳日:2024-11-09 15:35:37

# 太陽と空の下のビデオインスタンスシャドウ検出

Video Instance Shadow Detection Under the Sun and Sky ( http://arxiv.org/abs/2211.12827v3 )

ライセンス: Link先を確認

Zhenghao Xing, Tianyu Wang, Xiaowei Hu, Haoran Wu, Chi-Wing Fu, Pheng-Ann Heng,

(参考訳) 写真編集や光源方向推定などのアプリケーションに不可欠なインスタンスシャドウ検出は、シャドウインスタンス、オブジェクトインスタンス、およびそれらの対応関係の予測において大きな進歩を遂げている。このタスクを動画に拡張するには、多様なビデオデータへのアノテーションや、対応関係におけるオクルージョンや一時的な消失に起因する複雑さへの対処が課題となる。これらの課題に対応するために、ラベル付き画像データとラベルなしビデオデータの両方を学習に活用する半教師付きビデオインスタンスシャドウ検出フレームワークViShadowを紹介する。ViShadowは2段階のトレーニングパイプラインを備えている。第1段階ではラベル付き画像データを利用し、フレーム間ペアリングのための対照学習を通じて、シャドウインスタンスとオブジェクトインスタンスを識別する。第2段階ではラベルなしビデオを用い、追跡能力を高めるために対応関係のサイクル一貫性損失を組み込む。一時的な消失に対処し、追跡の連続性を確保するために検索機構を導入する。VISDソリューションの定量的評価のために、ラベルなしのトレーニングビデオとラベル付きのテストビデオからなるSOBA-VIDデータセットと、SOAP-VIDメトリックを導入する。ViShadowの有効性は、ビデオインペインティング、インスタンスクローン、シャドウ編集、テキスト指示によるシャドウ・オブジェクト操作など、様々なビデオレベルのアプリケーションを通じてさらに実証されている。

Instance shadow detection, crucial for applications such as photo editing and light direction estimation, has undergone significant advancements in predicting shadow instances, object instances, and their associations. The extension of this task to videos presents challenges in annotating diverse video data and addressing complexities arising from occlusion and temporary disappearances within associations. In response to these challenges, we introduce ViShadow, a semi-supervised video instance shadow detection framework that leverages both labeled image data and unlabeled video data for training. ViShadow features a two-stage training pipeline: the first stage, utilizing labeled image data, identifies shadow and object instances through contrastive learning for cross-frame pairing. The second stage employs unlabeled videos, incorporating an associated cycle consistency loss to enhance tracking ability. A retrieval mechanism is introduced to manage temporary disappearances, ensuring tracking continuity. The SOBA-VID dataset, comprising unlabeled training videos and labeled testing videos, along with the SOAP-VID metric, is introduced for the quantitative evaluation of VISD solutions. The effectiveness of ViShadow is further demonstrated through various video-level applications such as video inpainting, instance cloning, shadow editing, and text-instructed shadow-object manipulation.

公開日:2024-09-24
翻訳日:2024-11-09 15:35:37

# オンデバイストレーニング: 既存のシステムに関する最初の概要

On-device Training: A First Overview on Existing Systems ( http://arxiv.org/abs/2212.00824v3 )

ライセンス: Link先を確認

Shuai Zhu, Thiemo Voigt, JeongGil Ko, Fatemeh Rahimian,

(参考訳) 機械学習(ML)とディープラーニング(DL)の最近のブレークスルーは、幅広いアプリケーションドメインにまたがる様々なインテリジェントシステムの設計と開発を促進している。既存の機械学習モデルの多くは大きなメモリと計算能力を必要とするが、リソースに制約のあるデバイスにも一部のモデルをデプロイする努力が続けられている。初期のアプリケーションシステムの大半はMLとDLモデルの推論機能を活用することに重点を置いており、さまざまなモバイルおよび組み込みセンシングコンポーネントから取得したデータは、分類やセグメンテーションといったアプリケーション目標のためにこれらのモデルを通して処理される。最近では、ML/DLモデルの訓練にモバイルおよび組み込みコンピューティングリソースを活用するという概念が注目されている。こうした機能により、(i)無線リンクを介してデータを共有することなくローカルデータでモデルを訓練でき、設計段階からプライバシを保護した計算が可能になること、(ii)モデルのパーソナライズと環境適応、(iii)安定したインターネット接続のない遠隔でアクセスしにくい場所への正確なモデルの配置が可能になるためである。この研究は、デバイス上でのモデル訓練を可能にする最先端のシステム研究を要約・分析し、システムの観点からオンデバイストレーニングに関するサーベイを提供することを目標とする。

The recent breakthroughs in machine learning (ML) and deep learning (DL) have catalyzed the design and development of various intelligent systems over wide application domains. While most existing machine learning models require large memory and computing power, efforts have been made to deploy some models on resource-constrained devices as well. A majority of the early application systems focused on exploiting the inference capabilities of ML and DL models, where data captured from different mobile and embedded sensing components are processed through these models for application goals such as classification and segmentation. More recently, the concept of exploiting the mobile and embedded computing resources for ML/DL model training has gained attention, as such capabilities allow (i) the training of models via local data without the need to share data over wireless links, thus enabling privacy-preserving computation by design, (ii) model personalization and environment adaptation, and (iii) deployment of accurate models in remote and hard-to-access locations without stable internet connectivity. This work aims to summarize and analyze state-of-the-art systems research that allows such on-device model training capabilities and to provide a survey of on-device training from a systems perspective.

公開日:2024-09-23
翻訳日:2024-11-09 15:35:37

# モバイルアプリケーションにおけるAI技術に関する実証的研究

An Empirical Study of AI Techniques in Mobile Applications ( http://arxiv.org/abs/2212.01635v3 )

ライセンス: Link先を確認

Yinghua Li, Xueqi Dang, Haoye Tian, Tiezhu Sun, Zhijie Wang, Lei Ma, Jacques Klein, Tegawendé F. Bissyandé,

(参考訳) モバイルアプリケーションへの人工知能(AI)の統合は、さまざまなドメインを大きく変え、ユーザエクスペリエンスを高め、高度な機械学習(ML)とディープラーニング(DL)技術を通じてパーソナライズされたサービスを提供する。AI駆動のモバイルアプリは通常、ML/DL技術を活用して画像認識や自然言語処理などの重要なタスクを実行するアプリケーションを指す。本稿では、オンデバイスMLアプリ、オンデバイスDLアプリ、AIサービスに支えられた(クラウドベースの)アプリを対象に、AIアプリケーションに関する最も広範な実証的研究を行った。私たちの研究は、56,682の現実世界のAIアプリケーションを対象とし、3つの重要な視点に焦点を当てている。1)AIアプリの人気を分析し、AIアプリの更新状況を調査するアプリケーション分析。2)AIフレームワークの使用状況とAIモデル保護を分析するフレームワークとモデル分析。3)ユーザプライバシ保護とユーザレビューの態度を検討するユーザ分析。私たちの研究は、AIアプリ開発者、ユーザ、AI R&Dに対して重要な示唆を与える。一方では、モバイルアプリケーションにおけるAI統合の増加傾向に注目し、さまざまなAIフレームワークやモデルが広く採用されていることを示している。他方では、アプリのセキュリティを強化するために、堅牢なモデル保護が必要であることを強調している。さらに、ユーザプライバシの重要性を強調し、現在のAIアプリで使用されているAIテクノロジに対するユーザの態度を示す。私たちは、モバイルアプリケーションで使用されるAIテクノロジに関する将来の研究のためのオープンソースリソースとして、AIアプリデータセット(現在、最も広範なAIアプリデータセット)を提供している。

The integration of artificial intelligence (AI) into mobile applications has significantly transformed various domains, enhancing user experiences and providing personalized services through advanced machine learning (ML) and deep learning (DL) technologies. AI-driven mobile apps typically refer to applications that leverage ML/DL technologies to perform key tasks such as image recognition and natural language processing. In this paper, we conducted the most extensive empirical study on AI applications, exploring on-device ML apps, on-device DL apps, and AI service-supported (cloud-based) apps. Our study encompasses 56,682 real-world AI applications, focusing on three crucial perspectives: 1) Application analysis, where we analyze the popularity of AI apps and investigate the update states of AI apps; 2) Framework and model analysis, where we analyze AI framework usage and AI model protection; 3) User analysis, where we examine user privacy protection and user review attitudes. Our study has strong implications for AI app developers, users, and AI R&D. On one hand, our findings highlight the growing trend of AI integration in mobile applications, demonstrating the widespread adoption of various AI frameworks and models. On the other hand, our findings emphasize the need for robust model protection to enhance app security. Additionally, our study highlights the importance of user privacy and presents user attitudes towards the AI technologies utilized in current AI apps. We provide our AI app dataset (currently the most extensive AI app dataset) as an open-source resource for future research on AI technologies utilized in mobile applications.

公開日:2024-09-27
翻訳日:2024-11-09 15:35:37

# CURO:相対的過一般化のためのカリキュラム学習

CURO: Curriculum Learning for Relative Overgeneralization ( http://arxiv.org/abs/2212.02733v3 )

ライセンス: Link先を確認

Lin Shi, Qiyuan Liu, Bei Peng,

(参考訳) 相対的過一般化(Relative Overgeneralization, RO)は、最適な共同行動の効用が準最適な共同行動の効用を下回る場合に、協調的マルチエージェントタスクで生じうる病理である。ROにより、エージェントが局所最適に陥ったり、所定の時間ステップ内でエージェント間の高度な協調を必要とする協調タスクを解けなくなったりする。本研究では、マルチエージェント強化学習(MARL)において、価値ベースと方策勾配の両方のMARLアルゴリズムがROに悩まされ、効果的な協調方策を学習できない場合があることを実証的に見出した。ROをよりよく克服するために,相対的過一般化のためのカリキュラム学習(CURO)という新しい手法を提案する。強いROを示す目標タスクを解くために,CUROではまず目標タスクの報酬関数を微調整し,エージェントを訓練するためのソースタスクを生成する。次に,あるタスクで得られた知識を次のタスクに効果的に転移するために,価値関数の転移とバッファの転移を組み合わせた転移学習手法を用い,目標タスクにおけるより効率的な探索を可能にする。CUROは汎用的であり、価値ベースおよび方策勾配のMARL手法の両方に適用できる。QMIX, HAPPO, HATRPOに適用した場合, CUROは深刻なROを克服し, 性能を向上させ, 様々な困難な協調型マルチエージェントタスクにおいてベースライン手法を上回ることを示す。

Relative overgeneralization (RO) is a pathology that can arise in cooperative multi-agent tasks when the optimal joint action's utility falls below that of a sub-optimal joint action. RO can cause the agents to get stuck in local optima or fail to solve cooperative tasks requiring significant coordination between agents within a given timestep. In this work, we empirically find that, in multi-agent reinforcement learning (MARL), both value-based and policy gradient MARL algorithms can suffer from RO and fail to learn effective coordination policies. To better overcome RO, we propose a novel approach called curriculum learning for relative overgeneralization (CURO). To solve a target task that exhibits strong RO, in CURO, we first fine-tune the reward function of the target task to generate source tasks to train the agent. Then, to effectively transfer the knowledge acquired in one task to the next, we use a transfer learning method that combines value function transfer with buffer transfer, which enables more efficient exploration in the target task. CURO is general and can be applied to both value-based and policy gradient MARL methods. We demonstrate that, when applied to QMIX, HAPPO, and HATRPO, CURO can successfully overcome severe RO, achieve improved performance, and outperform baseline methods in a variety of challenging cooperative multi-agent tasks.
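Relative overgeneralization can be made concrete with a small cooperative matrix game. The sketch below is only an illustration of the pathology, not of CURO itself: the payoff values follow the well-known "climbing game" layout and are an assumed example, and the uniformly random partner stands in for an exploring teammate.

```python
# Payoff of a two-agent cooperative game (rows: agent 1, columns: agent 2).
# The numbers are an assumed example in the style of the "climbing game".
PAYOFF = [
    [11, -30, 0],
    [-30, 7, 6],
    [0, 0, 5],
]


def average_utilities(payoff):
    """Utility of each row action when the partner acts uniformly at random."""
    return [sum(row) / len(row) for row in payoff]


def best_joint_action(payoff):
    """Indices (row, column) of the highest joint payoff."""
    cells = [(value, i, j) for i, row in enumerate(payoff) for j, value in enumerate(row)]
    _, i, j = max(cells)
    return i, j


utilities = average_utilities(PAYOFF)
greedy_action = utilities.index(max(utilities))   # what an independent learner prefers
optimal_action = best_joint_action(PAYOFF)[0]     # agent 1's part of the best joint action
```

Here the best joint action is (0, 0) with payoff 11, yet action 0 has the lowest average utility because miscoordination is punished with -30. An agent that evaluates its own actions against an exploring partner is therefore drawn to action 2, which is the kind of sub-optimal convergence the abstract describes.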

公開日:2024-09-23
翻訳日:2024-11-09 15:35:37

# テンソル分解によるグラフニューラルネットワークの効率的な関係認識近傍集約

Efficient Relation-aware Neighborhood Aggregation in Graph Neural Networks via Tensor Decomposition ( http://arxiv.org/abs/2212.05581v4 )

ライセンス: Link先を確認

Peyman Baghershahi, Reshad Hosseini, Hadi Moradi,

(参考訳) 知識グラフ埋め込み(KGE)の課題に取り組むために,多数のグラフニューラルネットワーク(GNN)が開発された。しかし、これらのアプローチの多くは、関係情報の重要な役割を見落とし、エンティティ情報と不十分に統合し、表現力は低下する。本稿では,リレーショナルグラフ畳み込みネットワーク(R-GCN)の集約関数にテンソル分解を組み込んだ新しい知識グラフエンコーダを提案する。我々のモデルは、関係型によって定義される低ランクテンソルの射影行列を用いて、隣り合う実体の表現を強化する。このアプローチはマルチタスク学習を容易にし、関係認識表現を生成する。さらに、CP分解によるコアテンソルの低ランク推定手法を導入し、モデルを効果的に圧縮・正規化する。コントラスト学習にインスパイアされたトレーニング戦略を採用し,グラフ処理に固有の1-N法のトレーニング制限を緩和する。私たちはFB15k-237とWN18RRという2つの一般的なベンチマークデータセットにおいて、エンティティとリレーションのために低次元の埋め込みを使用しながら、競合のすべてを上回っました。

Numerous Graph Neural Networks (GNNs) have been developed to tackle the challenge of Knowledge Graph Embedding (KGE). However, many of these approaches overlook the crucial role of relation information and inadequately integrate it with entity information, resulting in diminished expressive power. In this paper, we propose a novel knowledge graph encoder that incorporates tensor decomposition within the aggregation function of Relational Graph Convolutional Network (R-GCN). Our model enhances the representation of neighboring entities by employing projection matrices of a low-rank tensor defined by relation types. This approach facilitates multi-task learning, thereby generating relation-aware representations. Furthermore, we introduce a low-rank estimation technique for the core tensor through CP decomposition, which effectively compresses and regularizes our model. We adopt a training strategy inspired by contrastive learning, which relieves the training limitation of the 1-N method inherent in handling vast graphs. We outperformed all our competitors on two common benchmark datasets, FB15k-237 and WN18RR, while using low-dimensional embeddings for entities and relations.
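To show why a low-rank factorization helps relation-aware aggregation, here is a minimal sketch that rebuilds each relation's projection matrix from CP factors and sums the projected neighbor embeddings. It is a simplification under stated assumptions: the plain three-factor CP form, the sum aggregator, and all function names are hypothetical, and the abstract does not give the paper's exact factorization or normalization.

```python
def cp_relation_matrix(A, B, C, r):
    """Rebuild the d_out x d_in projection matrix of relation r from CP factors.

    W[r][i][j] = sum_k A[r][k] * B[i][k] * C[j][k]
    A: num_relations x rank, B: d_out x rank, C: d_in x rank
    """
    rank = len(A[r])
    return [
        [sum(A[r][k] * B[i][k] * C[j][k] for k in range(rank)) for j in range(len(C))]
        for i in range(len(B))
    ]


def aggregate(neighbors, A, B, C):
    """Sum relation-specific projections of neighbor embeddings.

    neighbors: list of (relation_id, embedding) pairs for one target node.
    """
    d_out = len(B)
    out = [0.0] * d_out
    for r, h in neighbors:
        W = cp_relation_matrix(A, B, C, r)
        for i in range(d_out):
            out[i] += sum(W[i][j] * h[j] for j in range(len(h)))
    return out


def parameter_counts(num_relations, d_in, d_out, rank):
    """Parameters of full per-relation matrices versus the shared CP factors."""
    full = num_relations * d_in * d_out
    cp = rank * (num_relations + d_in + d_out)
    return full, cp
```

For example, with 237 relation types, 100-dimensional embeddings, and an assumed rank of 10, `parameter_counts(237, 100, 100, 10)` compares 2,370,000 parameters for independent matrices against 4,370 for the factors, which illustrates the compression and regularization effect the abstract attributes to the decomposition.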

公開日:2024-09-21
翻訳日:2024-11-09 15:35:37

# Z-SSMNet : Bi-parametric MRIによる前立腺癌検出と診断のためのゾーナル・アウェア自己監督メッシュネットワーク

Z-SSMNet: Zonal-aware Self-supervised Mesh Network for Prostate Cancer Detection and Diagnosis with Bi-parametric MRI ( http://arxiv.org/abs/2212.05808v2 )

ライセンス: Link先を確認

Yuan Yuan, Euijoon Ahn, Dagan Feng, Mohamad Khadra, Jinman Kim,

(参考訳) 臨床的に有意な前立腺癌(csPCa)の検出と診断において,bi-parametric magnetic resonance imaging (bpMRI)は重要なモダリティとなっている。bpMRIを用いてcsPCaを識別するAIベースのシステムを開発することで、効率性と費用対効果を高め、PCa管理を変革できる可能性がある。しかし、畳み込みニューラルネットワーク(CNN)を用いた現在の最先端手法は、異方性画像から面内および3次元の空間情報を学習する能力に限界がある。また、それらの性能は、大規模で多様かつ十分に注釈付けされたbpMRIデータセットが利用できるかどうかにも依存する。本研究では,多次元(2D/2.5D/3D)畳み込みを適応的に統合し,異方性bpMRIの密なスライス内情報と疎なスライス間情報をバランスよく学習するZonal-aware Self-supervised Mesh Network (Z-SSMNet)を提案する。bpMRIの外観,テクスチャ,構造のセマンティクスを学習するために,大規模なラベルなしデータを用いてネットワークを事前学習する自己教師付き学習(SSL)手法を提案する。これは、事前学習の段階でスライス内情報とスライス間情報の両方を捉えることを目的としている。さらに,csPCaの検出・診断能力をさらに向上させるため,ゾーン解剖学的領域に注目するようネットワークを制約した。10000件以上のマルチセンター・マルチスキャナデータからなるPI-CAIデータセットで広範な実験を行った。Z-SSMNetは病変レベルの検出(APスコア0.633)と患者レベルの診断(AUROCスコア0.881)の両方で優れ,PI-CAIチャレンジのOpen Development Phaseで首位を獲得した。Closed Testing Phaseでも高い性能を維持し,APスコア0.690とAUROCスコア0.909を達成して第2位となった。

Bi-parametric magnetic resonance imaging (bpMRI) has become a pivotal modality in the detection and diagnosis of clinically significant prostate cancer (csPCa). Developing AI-based systems to identify csPCa using bpMRI can transform PCa management by improving efficiency and cost-effectiveness. However, current state-of-the-art methods using convolutional neural networks (CNNs) are limited in learning in-plane and three-dimensional spatial information from anisotropic images. Their performance also depends on the availability of large, diverse, and well-annotated bpMRI datasets. We propose a Zonal-aware Self-supervised Mesh Network (Z-SSMNet) that adaptively integrates multi-dimensional (2D/2.5D/3D) convolutions to learn dense intra-slice information and sparse inter-slice information of the anisotropic bpMRI in a balanced manner. A self-supervised learning (SSL) technique is proposed to pre-train our network using large-scale unlabeled data to learn the appearance, texture, and structure semantics of bpMRI. It aims to capture both intra-slice and inter-slice information during the pre-training stage. Furthermore, we constrained our network to focus on the zonal anatomical regions to further improve the detection and diagnosis capability of csPCa. We conducted extensive experiments on the PI-CAI dataset comprising 10000+ multi-center and multi-scanner data. Our Z-SSMNet excelled in both lesion-level detection (AP score of 0.633) and patient-level diagnosis (AUROC score of 0.881), securing the top position in the Open Development Phase of the PI-CAI challenge. It maintained strong performance in the Closed Testing Phase, achieving an AP score of 0.690 and an AUROC score of 0.909 and securing second place.

公開日:2024-09-22
翻訳日:2024-11-09 15:35:37

# 大規模言語モデルにおけるグラフ学習とその発展

Graph Learning and Its Advancements on Large Language Models: A Holistic Survey ( http://arxiv.org/abs/2212.08966v5 )

ライセンス: Link先を確認

Shaopeng Wei, Jun Wang, Yu Zhao, Xingyan Chen, Qing Li, Fuzhen Zhuang, Ji Liu, Fuji Ren, Gang Kou,

(参考訳) グラフ学習は、ノード間の複雑な関係とグラフのトポロジ的構造を学習する試みである。長年にわたり、グラフ学習はグラフ理論からグラフデータマイニングへと移行してきた。表現学習の出現により、多様なシナリオにおいて顕著なパフォーマンスを達成した。幅広い応用の見通しから、グラフ学習には注意が集まっている。一部の研究者はグラフ学習に関する見事な調査を達成しているが、関連する目的や方法、アプリケーションをより一貫性のある方法で結びつけることに失敗した。その結果、グラフ学習の急速な拡大により、現在の十分なシナリオや課題は含まれなかった。特に、大規模言語モデルは近年、人間の生活に破壊的な影響を与えてきたが、構造化シナリオの相対的な弱点も示している。グラフ学習でこれらのモデルをいかに強力にするかという問題は、まだ未解決のままだ。我々の調査は、グラフ学習と事前訓練された言語モデルの統合における最新の進歩に焦点を当て、特に大規模言語モデルの領域におけるそれらの応用を強調した。グラフ学習に関するこれまでの調査とは違って、グラフ構造の観点から現在の研究を分析し、グラフ学習における最新のアプリケーション、トレンド、課題について論じる総合的なレビューを提供する。具体的には、分類学を提案し、それからグラフ学習の手法を要約する。次に、メインストリームアプリケーションの詳細な解明を提供します。最後に,今後の方向性を提案する。

Graph learning is a prevalent domain that endeavors to learn the intricate relationships among nodes and the topological structure of graphs. Over the years, graph learning has transcended from graph theory to graph data mining. With the advent of representation learning, it has attained remarkable performance in diverse scenarios. Owing to its extensive application prospects, graph learning attracts copious attention. While some researchers have accomplished impressive surveys on graph learning, they failed to connect related objectives, methods, and applications in a more coherent way. As a result, they did not encompass current ample scenarios and challenging problems due to the rapid expansion of graph learning. Particularly, large language models have recently had a disruptive effect on human life, but they also show relative weakness in structured scenarios. The question of how to make these models more powerful with graph learning remains open. Our survey focuses on the most recent advancements in integrating graph learning with pre-trained language models, specifically emphasizing their application within the domain of large language models. Different from previous surveys on graph learning, we provide a holistic review that analyzes current works from the perspective of graph structure, and discusses the latest applications, trends, and challenges in graph learning. Specifically, we commence by proposing a taxonomy and then summarize the methods employed in graph learning. We then provide a detailed elucidation of mainstream applications. Finally, we propose future directions.

公開日:2024-09-21
翻訳日:2024-11-09 15:35:37

# 政策学習の「無」重複:ペシミズムと経験的バーンスタインの不平等の一般化

Policy learning "without" overlap: Pessimism and generalized empirical Bernstein's inequality ( http://arxiv.org/abs/2212.09900v3 )

ライセンス: Link先を確認

Ying Jin, Zhimei Ren, Zhuoran Yang, Zhaoran Wang,

(参考訳) 本論文は, オフライン政策学習において, 事前に収集した観測(固定的あるいは適応的に進化する行動方針から得られたもの)を活用して, 与えられた集団に最適な総合的な結果をもたらす最適な個別化決定ルールを学習することを目的とした。既存の政策学習法は、一様重なりの仮定、すなわち、全ての個々の特性に対して全ての行動を探索する傾向スコアが下から有界でなければならないという仮定に依存している。データ収集プロセスをコントロールすることができないため、この仮定は多くの状況、特に行動方針が時間とともに変化し、特定の行動の傾向スコアが減少していく場合において非現実的になり得る。本稿では,政策値の点推定の代わりに下側信頼限界(LCB)を最適化する新しいアルゴリズムである悲観的政策学習(PPL)を提案する。 LCBは、オフラインデータを収集するための行動ポリシーの知識を用いて構築される。均一な重なり条件を仮定せずに、我々はアルゴリズムの準最適性に対するデータ依存上界を確立する。この上界は、(i) 最適方針に対する重なり、及び (ii) 最適化したポリシークラスの複雑さのみに依存する。すなわち、適応的に収集されたデータに対して、最適行動の傾向スコアが時間を通じて下から有界である限り、準最適行動の傾向スコアが任意に高速に減少しても、効率的なポリシー学習が保証される。理論解析において、逆傾向スコア重み付け推定器のための新しい自己正規化型集中不等式を開発し、よく知られた経験的ベルンシュタインの不等式を非有界および非i.i.d.データに一般化する。我々はPPLの有効性を実証する広範囲なシミュレーション研究や実世界の応用と同様に、Majorization-Minimizationとポリシーツリー探索による効率的な最適化アルゴリズムを用いて、我々の理論を補完する。

This paper studies offline policy learning, which aims at utilizing observations collected a priori (from either fixed or adaptively evolving behavior policies) to learn an optimal individualized decision rule that achieves the best overall outcomes for a given population. Existing policy learning methods rely on a uniform overlap assumption, i.e., the propensities of exploring all actions for all individual characteristics must be lower bounded. As one has no control over the data collection process, this assumption can be unrealistic in many situations, especially when the behavior policies are allowed to evolve over time with diminishing propensities for certain actions. In this paper, we propose Pessimistic Policy Learning (PPL), a new algorithm that optimizes lower confidence bounds (LCBs) -- instead of point estimates -- of the policy values. The LCBs are constructed using knowledge of the behavior policies for collecting the offline data. Without assuming any uniform overlap condition, we establish a data-dependent upper bound for the suboptimality of our algorithm, which only depends on (i) the overlap for the optimal policy, and (ii) the complexity of the policy class we optimize over. As an implication, for adaptively collected data, we ensure efficient policy learning as long as the propensities for optimal actions are lower bounded over time, while those for suboptimal ones are allowed to diminish arbitrarily fast. In our theoretical analysis, we develop a new self-normalized type concentration inequality for inverse-propensity-weighting estimators, generalizing the well-known empirical Bernstein's inequality to unbounded and non-i.i.d. data. We complement our theory with an efficient optimization algorithm via Majorization-Minimization and policy tree search, as well as extensive simulation studies and real-world applications that demonstrate the efficacy of PPL.

公開日:2024-09-26
翻訳日:2024-11-09 15:35:37
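
The pessimism principle in the abstract above can be illustrated with a small sketch. It compares a plain inverse-propensity-weighted (IPW) point estimate with a lower-confidence-bound style score. The penalty term used here (a scaled root-sum of squared inverse propensities), the constant `beta`, and all function and variable names are illustrative assumptions, not the paper's exact construction.

```python
import math
from typing import Callable, Dict, List, Tuple

# One logged record: (context x, logged action a, outcome y,
# behavior-policy propensities e[action] for that context).
Record = Tuple[float, int, float, Dict[int, float]]


def ipw_value(policy: Callable[[float], int], data: List[Record]) -> float:
    """IPW point estimate of the policy value."""
    total = 0.0
    for x, a, y, e in data:
        if policy(x) == a:
            total += y / e[a]
    return total / len(data)


def pessimistic_value(policy: Callable[[float], int], data: List[Record],
                      beta: float = 1.0) -> float:
    """Point estimate minus an uncertainty penalty (illustrative form only).

    The penalty grows when the policy picks actions that the behavior
    policy rarely explored, i.e. when overlap for this policy is poor.
    """
    n = len(data)
    penalty = math.sqrt(sum(1.0 / e[policy(x)] ** 2 for x, _, _, e in data)) / n
    return ipw_value(policy, data) - beta * penalty


# Toy log: action 0 is explored 90% of the time, action 1 only 10%.
e = {0: 0.9, 1: 0.1}
data: List[Record] = [(0.0, 0, 0.5, e)] * 9 + [(0.0, 1, 0.6, e)]
policies = {"always_0": lambda x: 0, "always_1": lambda x: 1}

greedy_choice = max(policies, key=lambda name: ipw_value(policies[name], data))
pessimistic_choice = max(policies, key=lambda name: pessimistic_value(policies[name], data))
print(greedy_choice, pessimistic_choice)
```

In this toy log the point estimate favors the rarely explored action on the strength of a single observation, while the penalized score prefers the well-explored one.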

# 局所駆動型量子磁石の空間熱化

Real space thermalization of locally driven quantum magnets ( http://arxiv.org/abs/2212.13790v2 )

ライセンス: Link先を確認

Ronald Melendrez, Bhaskar Mukherjee, Prakash Sharma, Arijeet Pal, Hitesh J. Changlani,

(参考訳) 孤立系における熱化とその分解の研究は、非平衡量子状態とその初期状態への依存性の深い理解につながった。初期状態の役割は、量子多体傷跡(quantum many-body scars)の存在によって顕著に強調され、基礎となる効果的なスーパースピン構造を持つ特別な非熱的状態は、他のカオス多体スペクトルに埋め込まれている。スピン・ハイゼンベルクと$XXZ$モデルとその一次元および高次元の変種は、正確な量子多体傷跡を持ち、合成および凝縮物質系において実現可能なスピンヘリックス状態の完全な復活を示すことが示されている。これらの進歩に触発されて、空間熱化プロファイルを探索し、システムの異なる部位がどのように熱化しスーパースピンの運命にどのように影響するかを明らかにするために、実験的にアクセス可能で、局所的、時間に依存したプロトコルを提案する。我々は、駆動スピンと他のスピンとの相互作用に基づいて、強磁性($X$偏極)初期状態の異なるパラメトリックな領域を特定する。これには、駆動スピンが実質的に分離し、他のスピンを加熱する役割を果たしながら自らは「冷たい」スポットのように振る舞う局所的な非熱的挙動が含まれる。また,スーパースピンが長時間の局所駆動に対して回復力を持つパラメータ領域も同定する。数値観測を説明する実空間とフロケット空間の描像を構築し,様々な実験装置で検証可能な予測を行う。

The study of thermalization and its breakdown in isolated systems has led to a deeper understanding of non-equilibrium quantum states and their dependence on initial conditions. The role of initial conditions is prominently highlighted by the existence of quantum many-body scars, special athermal states with an underlying effective superspin structure, embedded in an otherwise chaotic many-body spectrum. Spin Heisenberg and $XXZ$ models and their variants in one and higher dimensions have been shown to host exact quantum many-body scars, exhibiting perfect revivals of spin helix states that are realizable in synthetic and condensed matter systems. Motivated by these advances, we propose experimentally accessible, local, time-dependent protocols to explore the spatial thermalization profile and highlight how different parts of the system thermalize and affect the fate of the superspin. We identify distinct parametric regimes for the ferromagnetic ($X$-polarized) initial state based on the interplay between the driven spin and the rest, including local athermal behavior where the driven spin effectively decouples, acting like a "cold" spot while being instrumental in heating up the other spins. We also identify parameter regimes where the superspin remains resilient to local driving for long time scales. We develop a real and Floquet space picture that explains our numerical observations, and make predictions that can be tested in various experimental setups.

公開日:2024-09-26
翻訳日:2024-11-09 15:24:36

# インコンテクスト学習に関する調査研究

A Survey on In-context Learning ( http://arxiv.org/abs/2301.00234v5 )

ライセンス: Link先を確認

Qingxiu Dong, Lei Li, Damai Dai, Ce Zheng, Jingyuan Ma, Rui Li, Heming Xia, Jingjing Xu, Zhiyong Wu, Tianyu Liu, Baobao Chang, Xu Sun, Lei Li, Zhifang Sui,

(参考訳) 大規模言語モデル(LLM)の能力の増大に伴い、インコンテキスト学習(ICL)は自然言語処理(NLP)の新しいパラダイムとして登場し、LLMはいくつかの例で拡張されたコンテキストに基づいて予測を行う。 ICLを探索してLLMの能力を評価・外挿する重要な傾向である。本稿では,ICLの進歩と課題を概観し,整理することを目的とする。まず、ICLの形式的定義を示し、関連する研究との相関を明らかにする。そこで我々は,訓練戦略,迅速な設計戦略,関連する分析など,高度な手法を整理し,議論する。さらに、データエンジニアリングや知識更新など、さまざまなICLアプリケーションシナリオについても検討する。最後に、ICLの課題に対処し、さらなる研究の方向性を提案する。 ICLがどのように機能し、ICLを改善するかについて、私たちの研究がより深く研究されることを願っています。

With the increasing capabilities of large language models (LLMs), in-context learning (ICL) has emerged as a new paradigm for natural language processing (NLP), where LLMs make predictions based on contexts augmented with a few examples. It has been a significant trend to explore ICL to evaluate and extrapolate the ability of LLMs. In this paper, we aim to survey and summarize the progress and challenges of ICL. We first present a formal definition of ICL and clarify its correlation to related studies. Then, we organize and discuss advanced techniques, including training strategies, prompt designing strategies, and related analysis. Additionally, we explore various ICL application scenarios, such as data engineering and knowledge updating. Finally, we address the challenges of ICL and suggest potential directions for further research. We hope that our work can encourage more research on uncovering how ICL works and improving ICL.

公開日:2024-10-05
翻訳日:2024-11-09 15:24:36
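
The ICL setup described above, where predictions are conditioned on a context augmented with a few examples, can be sketched as plain prompt construction. The template (`Review:` / `Sentiment:`), the function name, and the demonstrations are hypothetical choices for illustration; the survey covers many other prompt designs.

```python
from typing import List, Tuple


def build_icl_prompt(demos: List[Tuple[str, str]], query: str) -> str:
    """Concatenate labeled demonstrations, then the unlabeled query.

    A language model would be asked to continue the text after the
    final "Sentiment:"; no model parameters are updated.
    """
    blocks = [f"Review: {text}\nSentiment: {label}" for text, label in demos]
    blocks.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(blocks)


demos = [
    ("Great film, loved it.", "positive"),
    ("Dull and far too long.", "negative"),
]
prompt = build_icl_prompt(demos, "A charming surprise.")
print(prompt)
```

Choices such as which demonstrations to include and in what order are part of what the abstract calls prompt designing strategies.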

# ハニーポットデータにおける教師なし攻撃パターン検出のためのネストディリクレモデル

Nested Dirichlet models for unsupervised attack pattern detection in honeypot data ( http://arxiv.org/abs/2301.02505v3 )

ライセンス: Link先を確認

Francesco Sanna Passino, Anastasia Mantziou, Daniyar Ghani, Philip Thiede, Ross Bevington, Nicholas A. Heard,

(参考訳) サイバーシステムは侵入の試みからほぼ一貫した脅威にさらされている。攻撃の種類は異なるが、それぞれの試みは典型的には特定の意図を持ち、加害者は典型的には同様の目的を持った個人のグループである。共通の意図を共有しているように見えるクラスタリング攻撃は、脅威追跡の専門家にとって非常に価値がある。本稿では、悪意のある攻撃者を誘惑するように設計された特別なネットワークホストであるハニーポットから収集した端末セッションコマンドをクラスタリングするためのディリクレ分布トピックモデルについて検討する。セッションをクラスタリングする主な実践的意味は2つある。様々な統計モデルが検討され、コマンドライン構文の構造に適応している。特に、セカンダリトピックとセカンダリトピックの概念、そしてセッションレベルおよびコマンドレベルトピックの概念が、解釈可能性を改善するためにモデルに導入される。提案手法はさらにベイズ的非パラメトリックな方法で拡張され、語彙サイズと潜在意図数の非有界性を許容する。これらの手法は、従来のトピックモデリングアプローチでは検出されていない、既存の暗号通貨のコインマイニングインフラを乗っ取ろうとする、珍しいMIRAI変異を発見している。

Cyber-systems are under near-constant threat from intrusion attempts. Attack types vary, but each attempt typically has a specific underlying intent, and the perpetrators are typically groups of individuals with similar objectives. Clustering attacks appearing to share a common intent is very valuable to threat-hunting experts. This article explores Dirichlet distribution topic models for clustering terminal session commands collected from honeypots, which are special network hosts designed to entice malicious attackers. The main practical implications of clustering the sessions are two-fold: finding similar groups of attacks, and identifying outliers. A range of statistical models are considered, adapted to the structures of command-line syntax. In particular, concepts of primary and secondary topics, and then session-level and command-level topics, are introduced into the models to improve interpretability. The proposed methods are further extended in a Bayesian nonparametric fashion to allow unboundedness in the vocabulary size and the number of latent intents. The methods are shown to discover an unusual MIRAI variant which attempts to take over existing cryptocurrency coin-mining infrastructure, not detected by traditional topic-modelling approaches.

公開日:2024-09-22
翻訳日:2024-11-09 15:24:36

# 授業増分学習における効果的な意思決定境界学習

Effective Decision Boundary Learning for Class Incremental Learning ( http://arxiv.org/abs/2301.05180v4 )

ライセンス: Link先を確認

Kunchi Li, Jun Wan, Shan Yu,

(参考訳) クラスインクリメンタルラーニング(CIL)におけるリハーサルアプローチは、知識蒸留のための古いクラスデータの不足と、記憶メモリが限られているため、学習と新しいクラス間の不均衡なデータ学習という2つの要因によって、新しいクラスに過度に適合する決定境界に悩まされる。本研究では,これらの2つの要因に対処するための,単純かつ効果的なアプローチを提案する。まず、再サンプリング戦略とMixup Knowledge Distillation (Re-MKD)を用いて、KDの性能を改善する。具体的には、学習されたクラスと新しいクラス間の潜伏分布とより整合したKDトレーニングで使用される適切なデータを合成するために、ミックスアップと再サンプリングの戦略を組み合わせる。次に, インフルエンスバランス法をCIL設定に拡張することにより, インクリメンタルインフルエンスバランス(IIB)法を提案する。これら2つの改善により、KDの性能を改善し、不均衡なデータ学習を同時に扱う効果的な決定境界学習アルゴリズム(EDBL)を提案する。実験の結果、EDBLはいくつかのCILベンチマークで最先端のパフォーマンスを達成できた。

Rehearsal approaches in class incremental learning (CIL) suffer from decision boundary overfitting to new classes, which is mainly caused by two factors: insufficiency of old-class data for knowledge distillation and imbalanced data learning between the learned and new classes because of the limited storage memory. In this work, we present a simple but effective approach to tackle these two factors. First, we employ a re-sampling strategy and Mixup Knowledge Distillation (Re-MKD) to improve the performance of KD, which would greatly alleviate the overfitting problem. Specifically, we combine mixup and re-sampling strategies to synthesize adequate data used in KD training that are more consistent with the latent distribution between the learned and new classes. Second, we propose a novel incremental influence balance (IIB) method for CIL to tackle the classification of imbalanced data by extending the influence balance method into the CIL setting, which re-weights samples by their influences to create a proper decision boundary. With these two improvements, we present the effective decision boundary learning algorithm (EDBL) which improves the performance of KD and deals with the imbalanced data learning simultaneously. Experiments show that the proposed EDBL achieves state-of-the-art performance on several CIL benchmarks.

公開日:2024-09-26
翻訳日:2024-11-09 15:24:36

# 表面マイニングにおける自動化とAI技術 -Pilbaraにおけるオープンピット操作の簡単な紹介-

Automation and AI Technology in Surface Mining With a Brief Introduction to Open-Pit Operations in the Pilbara ( http://arxiv.org/abs/2301.09771v6 )

ライセンス: Link先を確認

Raymond Leung, Andrew J Hill, Arman Melkumyan,

(参考訳) 本稿では,鉱業,特に西オーストラリアのピルバラ鉄鉱地帯で発生した工学的問題,技術革新,ロボット開発,自動化の取り組みについて概説する。目標は、テクノロジの展望を描き、エンジニアリングのオーディエンスに関連する課題を強調して、AIに対する認識を高め、マイニングにおける自動化のトレンドを高めることだ。これは、読者が鉱業に関する事前の知識を持っていないと仮定し、共通の露天掘り鉱業に関する議論と短い要約を通じて、徐々に文脈を構築していく。主な活動は、資源開発、鉱業、鉄道、港湾業の分野に分類される。鉱物探査から鉱石の出荷まで、この間にはおよそ9つの段階がある。地質学的アセスメント、鉱山計画と開発、生産の掘削と調査、爆破と掘削、鉱石と廃棄物の輸送、解体とスクリーン、ストックパイルとロードアウト、鉄道網の流通、および鉱石車ダンピングなどである。目的は、これらのプロセスを説明し、10年にわたる産業大学と研究開発のパートナーシップの観点から、課題/機会のいくつかについて洞察を提供することである。

This survey article provides a synopsis on some of the engineering problems, technological innovations, robotic development and automation efforts encountered in the mining industry -- particularly in the Pilbara iron-ore region of Western Australia. The goal is to paint the technology landscape and highlight issues relevant to an engineering audience to raise awareness of AI and automation trends in mining. It assumes the reader has no prior knowledge of mining and builds context gradually through focused discussion and short summaries of common open-pit mining operations. The principal activities that take place may be categorized in terms of resource development, mine-, rail- and port operations. From mineral exploration to ore shipment, there are roughly nine steps in between. These include: geological assessment, mine planning and development, production drilling and assaying, blasting and excavation, transportation of ore and waste, crush and screen, stockpile and load-out, rail network distribution, and ore-car dumping. The objective is to describe these processes and provide insights on some of the challenges/opportunities from the perspective of a decade-long industry-university R&D partnership.

公開日:2024-09-27
翻訳日:2024-11-09 15:24:36

# 単軌道分布ロバスト強化学習

Single-Trajectory Distributionally Robust Reinforcement Learning ( http://arxiv.org/abs/2301.11721v2 )

ライセンス: Link先を確認

Zhipeng Liang, Xiaoteng Ma, Jose Blanchet, Jiheng Zhang, Zhengyuan Zhou,

(参考訳) 古典的強化学習(RL)フレームワークが同一のトレーニング環境とテスト環境に大きく依存する限界を軽減するため、分散ロバストRL(DRRL)は、おそらく未知のテスト環境を含む様々な環境のパフォーマンスを高めるために提案されている。ロバスト性ゲインの価格として、DRRLは一連の分布を最適化するが、これは本質的に非ロバストな場合の固定分布を最適化するよりも難しい。既存のDRRLアルゴリズムはモデルベースか、1つのサンプル軌道から学習できないかのいずれかである。本稿では,分散ロバストなQ-ラーニング(DRQ)と呼ばれる,完全モデルフリーなDRRLアルゴリズムを設計する。本研究では,各サンプルを段階的に活用するマルチタイム・フレームワークを微妙に設計し,環境をモデル化せずに最適な分散ロバストなポリシーを直接学習する。アルゴリズムの複雑さにもかかわらず、古典確率近似ツールを一般化することにより漸近収束を保証する。総合的な実験結果から,提案アルゴリズムの頑健性やサンプルの複雑さは,非ロバストな手法や他のロバストなRLアルゴリズムと比較して優れていることが示された。

To mitigate the limitation that the classical reinforcement learning (RL) framework heavily relies on identical training and test environments, Distributionally Robust RL (DRRL) has been proposed to enhance performance across a range of environments, possibly including unknown test environments. As a price for robustness gain, DRRL involves optimizing over a set of distributions, which is inherently more challenging than optimizing over a fixed distribution in the non-robust case. Existing DRRL algorithms are either model-based or fail to learn from a single sample trajectory. In this paper, we design the first fully model-free DRRL algorithm, called distributionally robust Q-learning with single trajectory (DRQ). We carefully design a multi-timescale framework to fully utilize each incrementally arriving sample and directly learn the optimal distributionally robust policy without modelling the environment, so the algorithm can be trained along a single trajectory in a model-free fashion. Despite the algorithm's complexity, we provide asymptotic convergence guarantees by generalizing classical stochastic approximation tools. Comprehensive experimental results demonstrate the superior robustness and sample complexity of our proposed algorithm, compared to non-robust methods and other robust RL algorithms.

公開日:2024-09-21
翻訳日:2024-11-09 15:24:36

# 位相遷移を厳密に探究する普遍記号を定義する

Defining a universal sign to strictly probe a phase transition ( http://arxiv.org/abs/2301.12438v4 )

ライセンス: Link先を確認

Nvsen Ma, Jun-Song Sun, Gaopei Pan, Chen Cheng, Zheng Yan,

(参考訳) 量子モンテカルロシミュレーションにおける悪名高い符号問題の謎は、フェルミオン系およびフラストレーション系における手法の適用を効果的に制限している。最近の研究 (Science 375, 418 (2022)) では, 相転移の探索に符号を使用できることを指摘し, 符号問題において顕著なブレークスルーをおこなった。本研究では,符号問題と位相遷移が常に厳密に関連付けられないことを示すために,原点と参照系の間の自由エネルギーの差に関連する符号の定義に基づく一般論を提案した。符号は、基準系の自由エネルギーが変数パラメータの下で平坦である場合にのみ、位相遷移を正確にプローブすることができるが、設計はほぼ不可能である。一般に、記号が位相遷移を探索できるという結論は、普遍性のない生存バイアスである。この問題を解決するために,参照システムの影響を排除し,位相遷移を厳密に探索する修正符号を定義する。この研究は、新しい修飾符号によって相転移を検出する不偏解を与える。

The mystery of the infamous sign problem in quantum Monte Carlo simulations severely restricts applications of the method in fermionic and frustrated systems. A recent work [Science 375, 418 (2022)] made a remarkable breakthrough in the sign problem by pointing out that the sign can be used to probe phase transitions. In this work, we propose a general argument, based on the definition of the sign in terms of the difference in free energy between the original and reference systems, to clarify that the sign problem and phase transitions cannot always be strictly related. The sign can exactly probe a phase transition only if the free energy of the reference system is flat under variable parameters, which is almost impossible to design. Generally speaking, the conclusion that the sign can probe phase transitions is survivorship bias without universality. To solve this problem, we define a modified sign that excludes the influence of the reference system and can probe the phase transition strictly. The work gives an unbiased solution for detecting phase transitions by the new modified sign.

公開日:2024-09-26
翻訳日:2024-11-09 15:24:36
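
The free-energy argument in the abstract above can be written out with the standard reweighting identity for the average sign. The symbols $Z$, $Z_{\mathrm{ref}}$, $F$, $F_{\mathrm{ref}}$, the inverse temperature $\beta$ and the tuning parameter $g$ are notation assumed here for illustration; the abstract itself does not fix a notation, and the paper's modified sign is not reproduced.

```latex
\langle \mathrm{sign} \rangle
  = \frac{Z}{Z_{\mathrm{ref}}}
  = e^{-\beta\,(F - F_{\mathrm{ref}})},
\qquad
\frac{\partial \ln \langle \mathrm{sign} \rangle}{\partial g}
  = -\beta \left(
      \frac{\partial F}{\partial g}
      - \frac{\partial F_{\mathrm{ref}}}{\partial g}
    \right).
```

At fixed $\beta$, a non-analyticity in the sign can therefore originate from either $F$ or $F_{\mathrm{ref}}$. Only when $\partial F_{\mathrm{ref}} / \partial g$ is flat does the sign isolate the transition of the original system, which is the condition the abstract describes as almost impossible to design.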

# W2SAT: 軽量リテラルインシデンスグラフからSATインスタンスを生成する学習

W2SAT: Learning to generate SAT instances from Weighted Literal Incidence Graphs ( http://arxiv.org/abs/2302.00272v2 )

ライセンス: Link先を確認

Weihuang Wen, Tianshu Yu,

(参考訳) ブール満足度(SAT)問題は理論計算機科学において魅力的なNP完全問題であり、幅広いコンピューティング関連アプリケーションにおいて中心的な役割を果たす。多くのシナリオ下でSATソルバの爆発とチューニングを行うには、非常に高品質なSATインスタンスが必要である。そこで本論文では,実世界の実物/産業のインスタンスから本質的な構造と特性を暗黙的に学習し,SAT式を生成するフレームワークであるW2SATを提案する。この目的のために我々は,既存の表現能力と一般化性を示す新たなSAT表現であるWeighted Literal Incidence Graph (WLIG)を導入し,特殊学習に基づくグラフ生成モデルを用いて効率的に生成することができる。 WLIGからSAT問題へのデコーディングは、新しい丘登り最適化手法であるOWC(Optimal Weight Coverage)で重なり合う斜めの発見としてモデル化される。実験では,従来の手法と比較して,グラフメトリクス,効率,拡張性の観点からWLIGによるアプローチの優位性を示す。さらに、実世界のアプリケーションにおけるグラフベースのSAT生成の限界、特にSATソルバパラメータチューニングのために生成されたインスタンスを利用する場合について論じ、潜在的な方向を示す。

The Boolean Satisfiability (SAT) problem stands out as an attractive NP-complete problem in theoretical computer science and plays a central role in a broad spectrum of computing-related applications. Exploiting and tuning SAT solvers under numerous scenarios requires massive high-quality industry-level SAT instances, which unfortunately are quite limited in the real world. To address the data insufficiency issue, in this paper, we propose W2SAT, a framework to generate SAT formulas by learning intrinsic structures and properties from given real-world/industrial instances in an implicit fashion. To this end, we introduce a novel SAT representation called Weighted Literal Incidence Graph (WLIG), which exhibits strong representation ability and generalizability against existing counterparts, and can be efficiently generated via a specialized learning-based graph generative model. Decoding from WLIGs into SAT problems is then modeled as finding overlapping cliques with a novel hill-climbing optimization method termed Optimal Weight Coverage (OWC). Experiments demonstrate the superiority of our WLIG-induced approach in terms of graph metrics, efficiency, and scalability in comparison to previous methods. Additionally, we discuss the limitations of graph-based SAT generation for real-world applications, especially when utilizing generated instances for SAT solver parameter-tuning, and propose some potential directions.

公開日:2024-09-24
翻訳日:2024-11-09 15:24:36
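
To make the graph representation above more concrete, the sketch below builds a weighted literal co-occurrence graph from a CNF formula. It assumes that nodes are literals and that an edge weight counts the clauses in which two literals appear together; this is one plausible reading of "Weighted Literal Incidence Graph", and the paper's exact definition may differ. Clauses use DIMACS-style signed integers, and the function name is hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List, Tuple


def weighted_literal_incidence_graph(
    clauses: List[List[int]],
) -> Dict[Tuple[int, int], int]:
    """Map each unordered literal pair to the number of clauses containing both."""
    weights: Counter = Counter()
    for clause in clauses:
        literals = sorted(set(clause))  # canonical order, duplicates removed
        for u, v in combinations(literals, 2):
            weights[(u, v)] += 1
    return dict(weights)


# (x1 or not x2 or x3) and (x1 or x3) and (not x1 or x2)
cnf = [[1, -2, 3], [1, 3], [-1, 2]]
wlig = weighted_literal_incidence_graph(cnf)
print(wlig)
```

Under this reading each clause appears as a clique over its literals, which is consistent with the abstract framing the decoding step as finding overlapping cliques.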

# Wasserstein距離におけるロバスト推定

Robust Estimation under the Wasserstein Distance ( http://arxiv.org/abs/2302.01237v2 )

ライセンス: Link先を確認

Sloan Nietert, Rachel Cummings, Ziv Goldfeld,

(参考訳) 本稿では、最適輸送(OT)理論に根ざした確率分布間の一般的な相違尺度であるワッサーシュタイン距離の下でのロバスト分布推定の問題について検討する。未知分布$\mu$からの$n$個のサンプルのうち$\varepsilon n$個が敵対的に破損している場合に、最小のワッサーシュタイン誤差を持つ$\mu$の推定を求める。この課題に対処するために, OT とロバスト統計学の2つのフレームワーク, 部分 OT (POT) と最小距離推定 (MDE) について考察した。我々はPOTの新たな構造特性を証明し、それを用いて、部分的なワッサーシュタイン距離のMDEが、多くの設定においてミニマックス最適なロバスト推定リスクを達成することを示す。その過程で、標準的なOTに対する古典的カントロビッチ双対にsupノルムのペナルティを加えるPOTの新しい双対形式を導出する。一般的なWGAN(Wasserstein Generative Adversarial Network)フレームワークは,カントロビッチ双対性を介してWasserstein MDEを実装しているため,我々のペナルティ付き双対は,WGANに基本的な修正を加えて,汚染データセットを用いた大規模生成モデリングを可能にする。敵対的な破損の影響を緩和する手法の有効性を実証する数値実験を行った。

We study the problem of robust distribution estimation under the Wasserstein distance, a popular discrepancy measure between probability distributions rooted in optimal transport (OT) theory. Given $n$ samples from an unknown distribution $\mu$, of which $\varepsilon n$ are adversarially corrupted, we seek an estimate for $\mu$ with minimal Wasserstein error. To address this task, we draw upon two frameworks from OT and robust statistics: partial OT (POT) and minimum distance estimation (MDE). We prove new structural properties for POT and use them to show that MDE under a partial Wasserstein distance achieves the minimax-optimal robust estimation risk in many settings. Along the way, we derive a novel dual form for POT that adds a sup-norm penalty to the classic Kantorovich dual for standard OT. Since the popular Wasserstein generative adversarial network (WGAN) framework implements Wasserstein MDE via Kantorovich duality, our penalized dual enables large-scale generative modeling with contaminated datasets via an elementary modification to WGAN. Numerical experiments demonstrating the efficacy of our approach in mitigating the impact of adversarial corruptions are provided.

公開日:2024-09-24
翻訳日:2024-11-09 15:24:36

# 適応的データ分析のためのサブサンプリング手法

Subsampling Suffices for Adaptive Data Analysis ( http://arxiv.org/abs/2302.08661v3 )

ライセンス: Link先を確認

Guy Blanc,

(参考訳) データセットで行った分析が全人口を代表することを保証することは、統計学における中心的な問題の一つである。ほとんどの古典的なテクニックは、データセットがアナリストのクエリとは独立していると仮定し、データセットが複数の適応的に選択されたクエリのために再利用される一般的な設定では破綻する。この適応的データ分析(adaptive data analysis)の問題は、Dwork et al. (STOC, 2015) と Hardt and Ullman (FOCS, 2014) の先駆的な研究で定式化された。クエリが適応的に選択されたとしても、クエリが代表性を保ち続けるという、非常に単純な仮定のセットを特定します。必要な条件は、各クエリがランダムなサブサンプルを入力とし、少数のビットを出力することだけである。この結果は,サブサンプリングに固有のノイズが,クエリ応答の一般化を保証するのに十分であることを示している。このサブサンプルベースのフレームワークの単純さにより、以前の作業でカバーされていないさまざまな現実世界のシナリオをモデル化することができる。その単純さに加えて、統計的クエリと中央値探索という2つの基本的なタスクのメカニズムを設計することで、このフレームワークの有用性を実証する。特に、広く適用可能な統計クエリのクラスに答えるメカニズムは、多くのパラメータ領域において非常に単純かつ最先端である。

Ensuring that analyses performed on a dataset are representative of the entire population is one of the central problems in statistics. Most classical techniques assume that the dataset is independent of the analyst's query and break down in the common setting where a dataset is reused for multiple, adaptively chosen, queries. This problem of adaptive data analysis was formalized in the seminal works of Dwork et al. (STOC, 2015) and Hardt and Ullman (FOCS, 2014). We identify a remarkably simple set of assumptions under which the queries will continue to be representative even when chosen adaptively: The only requirements are that each query takes as input a random subsample and outputs few bits. This result shows that the noise inherent in subsampling is sufficient to guarantee that query responses generalize. The simplicity of this subsampling-based framework allows it to model a variety of real-world scenarios not covered by prior work. In addition to its simplicity, we demonstrate the utility of this framework by designing mechanisms for two foundational tasks, statistical queries and median finding. In particular, our mechanism for answering the broadly applicable class of statistical queries is both extremely simple and state of the art in many parameter regimes.

公開日:2024-09-24
翻訳日:2024-11-09 15:24:36
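
The two requirements named in the abstract (each query sees only a random subsample and outputs few bits) can be sketched for a statistical query as follows. This is a hedged illustration rather than the paper's mechanism: the subsample size `w`, the number of votes `k`, the one-bit randomized rounding, and the function name are all assumptions made here.

```python
import random
from typing import Callable, Optional, Sequence


def answer_statistical_query(
    data: Sequence[int],
    phi: Callable[[int], float],   # query with values in [0, 1]
    k: int = 2000,                 # number of one-bit votes (assumed)
    w: int = 1,                    # subsample size per vote (assumed)
    rng: Optional[random.Random] = None,
) -> float:
    """Estimate the mean of phi using only one-bit outputs on random subsamples."""
    if rng is None:
        rng = random.Random(0)
    pool = list(data)
    ones = 0
    for _ in range(k):
        subsample = rng.sample(pool, w)             # query sees w points only
        p = sum(phi(x) for x in subsample) / w      # value in [0, 1]
        if rng.random() < p:                        # round to a single bit
            ones += 1
    return ones / k


data = list(range(100))
estimate = answer_statistical_query(data, lambda x: 1.0 if x < 30 else 0.0)
print(estimate)  # close to the empirical mean 0.3, up to sampling noise
```

Each vote reveals one bit about a tiny subsample, so the answer carries sampling noise of its own. The abstract's claim is that noise of this kind suffices for generalization under adaptivity; the sketch does not attempt to reproduce the paper's guarantees.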

# 視覚変換器の効率的な知識蒸留におけるマスキングの役割

The Role of Masking for Efficient Supervised Knowledge Distillation of Vision Transformers ( http://arxiv.org/abs/2302.10494v4 )

ライセンス: Link先を確認

Seungwoo Son, Jegwang Ryu, Namhoon Lee, Jaeho Lee,

(参考訳) 知識蒸留は、軽量視覚モデルの訓練に有効な方法である。しかし、特に視覚変換器(ViT)のような大規模モデルでは、トレーニングサンプルの教師監督を取得するのにコストがかかることが多い。本稿では,ViT蒸留の監督コストを削減するための簡易な枠組みを開発し,教師に与えられた少量の入力トークンを隠蔽する。入力トークンをマスキングすることで、教師のパラメータやアーキテクチャを変更することなく、マスクされたトークンに関連する計算をスキップすることができる。学生の注意点が最も低いマスキングパッチは、学生の精度を低下させることなく、教師のFLOPの最大50%を節約し、他のマスキング基準は、最適以下の効率向上をもたらす。より詳細な分析により,学生が指導するマスキングが学生に良いカリキュラムを提供することが明らかとなり,教師の指導が早い段階で容易に受けられるようになり,後半の課題も解決できた。

Knowledge distillation is an effective method for training lightweight vision models. However, acquiring teacher supervision for training samples is often costly, especially from large-scale models like vision transformers (ViTs). In this paper, we develop a simple framework to reduce the supervision cost of ViT distillation: masking out a fraction of input tokens given to the teacher. By masking input tokens, one can skip the computations associated with the masked tokens without requiring any change to teacher parameters or architecture. We find that masking patches with the lowest student attention scores is highly effective, saving up to 50% of teacher FLOPs without any drop in student accuracy, while other masking criteria lead to suboptimal efficiency gains. Through in-depth analyses, we reveal that the student-guided masking provides a good curriculum to the student, making teacher supervision easier to follow during the early stage and challenging in the later stage.

公開日:2024-09-27
翻訳日:2024-11-09 15:24:36
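
The masking criterion described above (drop the patches with the lowest student attention scores before running the teacher) reduces to a simple selection step. The sketch below uses plain Python lists as stand-ins for per-patch attention scores; the function name, the score values, and the 50% ratio are illustrative assumptions, and how the scores are extracted from the student is not specified here.

```python
from typing import List, Sequence


def tokens_to_keep(student_attention: Sequence[float], mask_ratio: float) -> List[int]:
    """Return indices of patches the teacher should still process.

    Patches with the lowest student attention are masked; the kept
    indices are returned in their original order.
    """
    n = len(student_attention)
    n_keep = n - int(n * mask_ratio)
    ranked = sorted(range(n), key=lambda i: student_attention[i], reverse=True)
    return sorted(ranked[:n_keep])


scores = [0.05, 0.30, 0.10, 0.25, 0.02, 0.28]   # hypothetical per-patch scores
kept = tokens_to_keep(scores, mask_ratio=0.5)
print(kept)  # [1, 3, 5]
```

Because the teacher simply receives fewer tokens, its parameters and architecture stay untouched, which matches the abstract's description of where the compute saving comes from.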

# ベイズ行列分解とその応用

Bayesian Matrix Decomposition and Applications ( http://arxiv.org/abs/2302.11337v3 )

ライセンス: Link先を確認

Jun Lu,

(参考訳) 本書の唯一の目的は、行列分解技法をシームレスに導入するために、ベイズ行列分解における概念と数学的ツールを自己完結的に導入することである。しかし、ベイズ行列の分解に関する有用かつ興味深い結果をすべてカバーできないことは明らかであり、最適化を行うための変分推論の分離解析を例に挙げる。ベイズ解析の分野における文献を参照し、関連する分野についてより詳細な解説を行う。この本は、主に目的、重要なベイズ行列分解法、例えば実数値分解、非負行列分解、ベイズ補間分解、およびそれらの応用に光を当てた方法の起源と複雑さの要約である。数学の前提条件は統計学と線型代数の最初のコースである。この控えめな背景以外は、開発は自己完結しており、厳密な証明が提供される。

The sole aim of this book is to give a self-contained introduction to concepts and mathematical tools in Bayesian matrix decomposition in order to seamlessly introduce matrix decomposition techniques and their applications in subsequent sections. However, we clearly realize that we cannot cover all the useful and interesting results concerning Bayesian matrix decomposition; given the limited scope of this discussion, some topics are left out, e.g., the separate analysis of variational inference for conducting the optimization. We refer the reader to literature in the field of Bayesian analysis for a more detailed introduction to the related fields. This book is primarily a summary of the purpose and significance of important Bayesian matrix decomposition methods, e.g., real-valued decomposition, nonnegative matrix factorization, Bayesian interpolative decomposition, and the origin and complexity of the methods which shed light on their applications. The mathematical prerequisite is a first course in statistics and linear algebra. Other than this modest background, the development is self-contained, with rigorous proofs provided throughout.

公開日:2024-09-26
翻訳日:2024-11-09 15:24:36

# 相互作用する2つのコールド極性分子の回転特性:線形、対称、非対称トップ

Rotational properties of two interacting cold polar molecules: linear, symmetric, and asymmetric tops ( http://arxiv.org/abs/2303.02199v2 )

ライセンス: Link先を確認

Felipe Isaule, Robert Bennett, Jörg B. Götte,

(参考訳) 我々は、外部dc電場と異方性双極子-双極子相互作用の影響下で、2つの静止した極性分子のポテンシャル-エネルギー曲線と双極子モーメントの偏極について検討した。分子を量子剛性ローターとしてモデル化し、その回転自由度を考慮し、線形、対称、非対称のトップ分子の選択を考える。分子間距離と電場の方向を変えたときの双極子のエネルギー曲線と偏極の総合的な検討を行い、分子の性質が短距離分離において電場の方向に強く依存していることを見出した。これは分子の回転を考慮することの重要性を示している。後者は、分子双極子気体において回転自由度を考慮することで生じうる効果についての洞察を与える。

We examine the potential-energy curves and polarization of the dipole moments of two static polar molecules under the influence of an external dc electric field and their anisotropic dipole-dipole interaction. We model the molecules as quantum rigid rotors to take their rotational degrees of freedom into account and consider a selection of linear, symmetric, and asymmetric top molecules. We provide a comprehensive examination of the energy curves and polarization of the dipoles for varying inter-molecular separation and direction of the electric field and find that the properties of the molecules depend strongly on the field's direction at short separations, showing the importance of accounting for molecular rotation. The latter provides insight into the possible effects of accounting for rotational degrees of freedom in molecular dipolar gases.

公開日:2024-09-23
翻訳日:2024-11-09 15:24:36

# 分割共形予測における経験的カバレッジの普遍的分布

Universal distribution of the empirical coverage in split conformal prediction ( http://arxiv.org/abs/2303.02770v2 )

ライセンス: Link先を確認

Paulo C. Marques F,

(参考訳) スプリット共形予測が交換可能なデータでバッチモードで動作する場合、将来の観測可能量の有限バッチに対して生成された予測セットの実験的カバレッジの正確な分布と、バッチサイズが無限大になるときにそのほぼ確実な限界の正確な分布を決定する。どちらの分布も普遍的であり、名前付きミスカバーレベルとキャリブレーションサンプルサイズのみによって決定されるため、アプリケーションで必要最小限のキャリブレーションサンプルサイズを選択するための基準が確立される。

When split conformal prediction operates in batch mode with exchangeable data, we determine the exact distribution of the empirical coverage of prediction sets produced for a finite batch of future observables, as well as the exact distribution of its almost sure limit when the batch size goes to infinity. Both distributions are universal, being determined solely by the nominal miscoverage level and the calibration sample size, thereby establishing a criterion for choosing the minimum required calibration sample size in applications.

公開日:2024-09-21
翻訳日:2024-11-09 15:24:36
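
The universality claim above can be checked by simulation. The sketch below runs split conformal prediction many times with uniform nonconformity scores and records the empirical coverage on a finite batch. It assumes the usual threshold rule, the $\lceil (1-\alpha)(n+1) \rceil$-th smallest calibration score. The distribution named in the closing note, Beta$(k,\, n+1-k)$, is the commonly cited form for continuous scores and is stated as background rather than quoted from this paper; all parameter values are illustrative.

```python
import math
import random
from typing import List


def conformal_threshold(cal_scores: List[float], alpha: float) -> float:
    """k-th smallest calibration score with k = ceil((1 - alpha) * (n + 1))."""
    n = len(cal_scores)
    k = math.ceil((1 - alpha) * (n + 1))
    if k > n:                      # too few calibration points: trivial set
        return float("inf")
    return sorted(cal_scores)[k - 1]


def simulate_coverages(n: int = 100, alpha: float = 0.1, batch: int = 500,
                       trials: int = 300, seed: int = 0) -> List[float]:
    """Empirical coverage on a fresh batch, repeated over independent calibrations."""
    rng = random.Random(seed)
    coverages = []
    for _ in range(trials):
        q = conformal_threshold([rng.random() for _ in range(n)], alpha)
        covered = sum(rng.random() <= q for _ in range(batch))
        coverages.append(covered / batch)
    return coverages


coverages = simulate_coverages()
mean_cov = sum(coverages) / len(coverages)
print(round(mean_cov, 3), min(coverages), max(coverages))
```

With $n = 100$ and $\alpha = 0.1$ the rule gives $k = 91$, so the average coverage sits near $91/101$, while individual runs scatter around it. That spread depends only on $n$ and $\alpha$, which is what makes it usable for choosing a calibration sample size.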

# 審美的不確実性のモデル化のための確率的統一関係--定理証明による意味論と自動推論

Probabilistic unifying relations for modelling epistemic and aleatoric uncertainty: semantics and automated reasoning with theorem proving ( http://arxiv.org/abs/2303.09692v3 )

ライセンス: Link先を確認

Kangfeng Ye, Jim Woodcock, Simon Foster,

(参考訳) 確率的プログラミングは、一般的なコンピュータプログラミング、統計的推論、フォーマルセマンティクスを組み合わせて、不確実性に直面した時にシステムが決定を下すのを助ける。確率的プログラムはユビキタスであり、マシンインテリジェンスに大きな影響を与えている。多くの確率的アルゴリズムは、実際には異なる領域で使われているが、形式的意味論に基づく自動検証は、まだ比較的新しい研究分野である。過去20年間、多くの関心を集めてきた。しかし、多くの課題が残っている。本稿では,確率的統一関係(ProbURel)について述べる。私たちの仕事は、Hehner氏の予測確率的プログラミングに基づいていますが、彼の仕事が広く採用されるにはいくつかの障害があります。ここでのコントリビューションは,(1)関係を算術と区別するためのIverson bracket表記の導入による構文と意味論の形式化,(2)Unifying Theories of Programming(UTP)を用いた関係の形式化と,実数の位相空間上の総和を用いたブラケット外の確率の形式化,(3)Kleeneの不動点定理を用いた確率ループの構成的意味論,(4)構成的意味論を扱うための,分布から部分分布および超分布への意味論の拡張,(5)確率ループの推論を単純化するための一意不動点定理,(6)Isabelle/HOL上のUTP実装であるIsabelle/UTPにおける理論の機械化と,定理証明による自動推論である。ロボットのローカライゼーションの問題,機械学習の分類,確率ループの終了など,6つの事例で研究成果を実演する。

Probabilistic programming combines general computer programming, statistical inference, and formal semantics to help systems make decisions when facing uncertainty. Probabilistic programs are ubiquitous, including having a significant impact on machine intelligence. While many probabilistic algorithms have been used in practice in different domains, their automated verification based on formal semantics is still a relatively new research area. In the last two decades, it has attracted much interest. Many challenges, however, remain. The work presented in this paper, probabilistic unifying relations (ProbURel), takes a step towards our vision to tackle these challenges. Our work is based on Hehner's predicative probabilistic programming, but there are several obstacles to the broader adoption of his work. Our contributions here include (1) the formalisation of its syntax and semantics by introducing an Iverson bracket notation to separate relations from arithmetic; (2) the formalisation of relations using Unifying Theories of Programming (UTP) and probabilities outside the brackets using summation over the topological space of the real numbers; (3) the constructive semantics for probabilistic loops using Kleene's fixed-point theorem; (4) the enrichment of its semantics from distributions to subdistributions and superdistributions to deal with the constructive semantics; (5) the unique fixed-point theorem to simplify the reasoning about probabilistic loops; and (6) the mechanisation of our theory in Isabelle/UTP, an implementation of UTP in Isabelle/HOL, for automated reasoning using theorem proving. We demonstrate our work with six examples, including problems in robot localisation, classification in machine learning, and the termination of probabilistic loops.

公開日:2024-09-26
翻訳日:2024-11-09 15:24:36

# 画像付きマルチモーダルシャノンゲーム

Multimodal Shannon Game with Images ( http://arxiv.org/abs/2303.11192v2 )

ライセンス: Link先を確認

Vilém Zouhar, Sunit Bhattacharya, Ondřej Bojar,

(参考訳) シャノンゲームは長年、言語学やNLPにおける思考実験として使われており、参加者に、前の文脈に基づいて次の文字を推測するよう求めてきた。画像情報の形式でオプションの余分なモダリティを導入することで、ゲームを拡張します。本ゲームにおけるマルチモーダル情報の影響を調べるため,人間と言語モデル(LM, GPT-2)を用いた。画像情報の追加により、人間とLMの両方の自己報告された信頼度と精度が向上することを示す。名詞や決定子などの一部の単語クラスは、追加のモダリティ情報から恩恵を受ける。ヒトとLMの双方のプライミング効果は、文脈サイズが増加するにつれてより明らかになる。これらの知見は、言語理解とモデリングを改善するためのマルチモーダル情報の可能性を強調している。

The Shannon game has long been used as a thought experiment in linguistics and NLP, asking participants to guess the next letter in a sentence based on its preceding context. We extend the game by introducing an optional extra modality in the form of image information. To investigate the impact of multimodal information in this game, we use human participants and a language model (LM, GPT-2). We show that the addition of image information improves both self-reported confidence and accuracy for both humans and LM. Certain word classes, such as nouns and determiners, benefit more from the additional modality information. The priming effect in both humans and the LM becomes more apparent as the context size (extra modality information + sentence context) increases. These findings highlight the potential of multimodal information in improving language understanding and modeling.

公開日:2024-09-27
翻訳日:2024-11-09 15:24:36
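
The text-only version of the guessing game described above can be sketched in a few lines. This is a minimal illustration, not the authors' setup: it swaps GPT-2 for simple character n-gram counts, ignores the image modality entirely, and the function names (`train_char_model`, `guess_next`, `shannon_accuracy`) are hypothetical.

```python
from collections import Counter, defaultdict


def train_char_model(text, order=2):
    """Count which character follows each context of `order` characters."""
    counts = defaultdict(Counter)
    for i in range(order, len(text)):
        counts[text[i - order:i]][text[i]] += 1
    return counts


def guess_next(counts, context, order=2):
    """Return the most frequent follower of the last `order` characters, or None."""
    followers = counts.get(context[-order:])
    if not followers:
        return None
    return followers.most_common(1)[0][0]


def shannon_accuracy(counts, sentence, order=2):
    """Fraction of positions where the first guess equals the true next character."""
    hits = 0
    total = 0
    for i in range(order, len(sentence)):
        total += 1
        if guess_next(counts, sentence[:i], order) == sentence[i]:
            hits += 1
    return hits / total if total else 0.0


corpus = "the cat sat on the mat. the cat ate."
model = train_char_model(corpus, order=2)
accuracy = shannon_accuracy(model, "the cat sat", order=2)
```

A longer `order` plays the role of a larger context size; measuring how accuracy changes with it mirrors, very loosely, the context-size analysis mentioned in the abstract.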

# CompoNeRF:編集可能な3Dシーンレイアウトによるテキスト誘導多目的合成型NeRF

CompoNeRF: Text-guided Multi-object Compositional NeRF with Editable 3D Scene Layout ( http://arxiv.org/abs/2303.13843v5 )

ライセンス: Link先を確認

Haotian Bai, Yuanhuiyi Lyu, Lutao Jiang, Sijia Li, Haonan Lu, Xiaodong Lin, Lin Wang,

(参考訳) テキストから3Dの形式は、AR/VRのための編集可能な3Dシーンを作成する上で重要な役割を果たす。最近の進歩は、テキストから3Dオブジェクト生成のための事前訓練された拡散モデルとニューラルラジアンス場(NeRF)を融合させる可能性を示している。しかし、永続的な課題のひとつは、一貫性のあるマルチオブジェクト環境を正確に解析し再生する能力が不十分であることだ。特に、これらのモデルは、多目的テキストによって引き起こされる量やスタイルを正確に表現することが困難であり、しばしば、意味的な複雑さにマッチしないレンダリングの忠実度が崩壊する。さらに、これらの要素をコヒーレントな3Dシーンにアマルゲイトすることは、拡散モデルに固有の一般的な分布から生じる、重大な課題である。そこで我々は,「誘導崩壊」の問題に対処し,さらにシーンの一貫性を高めるために,編集可能な3Dシーンレイアウトとオブジェクト固有およびシーンワイドガイダンス機構を統合することで,CompoNeRFと呼ばれる新しいフレームワークを提案する。複雑なテキストを複数のNeRFで区切られたレイアウトに解釈し、それぞれが対応するサブテキストプロンプトとペアになって、正確なオブジェクトの描写を行う。次に、調整された合成モジュールがこれらのNeRFをシームレスにブレンドし、一貫性を促進し、二重レベルテキストガイダンスは曖昧さを低減し、精度を高める。特に、我々の構成設計では分解が可能である。これにより、編集されたレイアウトやテキストプロンプトに基づいてフレキシブルなシーン編集と新しいシーンへの再構成が可能になる。オープンソースの安定拡散モデルを用いて、CompoNeRFは高忠実度な多目的シーンを生成する。注目すべきは、このフレームワークはマルチビューCLIPスコア測定により、最大で \textbf{54\%} の改善を実現している点である。提案手法は,多目的シーン生成のための意味的精度,多視点一貫性,個人認識性を大幅に向上したことを示す。

Text-to-3D form plays a crucial role in creating editable 3D scenes for AR/VR. Recent advances have shown promise in merging neural radiance fields (NeRFs) with pre-trained diffusion models for text-to-3D object generation. However, one enduring challenge is their inadequate capability to accurately parse and regenerate consistent multi-object environments. Specifically, these models encounter difficulties in accurately representing quantity and style prompted by multi-object texts, often resulting in a collapse of the rendering fidelity that fails to match the semantic intricacies. Moreover, amalgamating these elements into a coherent 3D scene is a substantial challenge, stemming from generic distribution inherent in diffusion models. To tackle the issue of 'guidance collapse' and further enhance scene consistency, we propose a novel framework, dubbed CompoNeRF, by integrating an editable 3D scene layout with object-specific and scene-wide guidance mechanisms. It initiates by interpreting a complex text into the layout populated with multiple NeRFs, each paired with a corresponding subtext prompt for precise object depiction. Next, a tailored composition module seamlessly blends these NeRFs, promoting consistency, while the dual-level text guidance reduces ambiguity and boosts accuracy. Noticeably, our composition design permits decomposition. This enables flexible scene editing and recomposition into new scenes based on the edited layout or text prompts. Utilizing the open-source Stable Diffusion model, CompoNeRF generates multi-object scenes with high fidelity. Remarkably, our framework achieves up to a 54% improvement by the multi-view CLIP score metric. Our user study indicates that our method has significantly improved semantic accuracy, multi-view consistency, and individual recognizability for multi-object scene generation.

公開日:2024-09-24
翻訳日:2024-11-09 15:24:36

# データセットアーチタイプを用いた高レベル合成データ生成

High-Level Synthetic Data Generation with Data Set Archetypes ( http://arxiv.org/abs/2303.14301v3 )

ライセンス: Link先を確認

Michael J. Zellinger, Peter Bühlmann,

(参考訳) クラスタ分析は、異なるアルゴリズムの評価と比較に有効なベンチマークに依存している。クラスタ間の重なり合いやクラスタ形状の変化など,データセットの重要な特徴を効果的に変化させることができるため,合成データのシミュレーション研究が一般的である。残念ながら、評価シナリオのキュレートは、"全く異なる形状のクラスタ"のような高レベルのシナリオ記述と一致するように、実践者は(クラスタ共分散行列のような)低レベルの幾何学的パラメータを見つけなければならないため、しばしば困難である。ベンチマークをより便利かつ有益なものにするために,データセットのアーチタイプに基づく合成データ生成を提案する。このパラダイムでは、ユーザは高いレベルの評価シナリオを記述し、ソフトウェアは所望の特性を持つデータセットを自動的に生成する。このようなデータセットのアーチタイプと大規模言語モデル(LLM)を組み合わせることで、評価シナリオの言語記述のみからベンチマークを設定することができる。このワークフローを実装したオープンソースのPythonパッケージであるrepliclustを提供しています。言語入力からのデータ生成のデモはhttps://demo.repliclust.orgで公開されている。

Cluster analysis relies on effective benchmarks for evaluating and comparing different algorithms. Simulation studies on synthetic data are popular because important features of the data sets, such as the overlap between clusters, or the variation in cluster shapes, can be effectively varied. Unfortunately, curating evaluation scenarios is often laborious, as practitioners must find lower-level geometric parameters (like cluster covariance matrices) to match a higher-level scenario description like "clusters with very different shapes." To make benchmarks more convenient and informative, we propose synthetic data generation based on data set archetypes. In this paradigm, the user describes an evaluation scenario in a high-level manner, and the software automatically generates data sets with the desired characteristics. Combining such data set archetypes with large language models (LLMs), it is possible to set up benchmarks purely from verbal descriptions of the evaluation scenarios. We provide an open-source Python package, repliclust, that implements this workflow. A demo of data generation from verbal inputs is available at https://demo.repliclust.org.

公開日:2024-09-21
翻訳日:2024-11-09 15:24:36

# Cesno: 新しいプログラミング言語の初期設計

Cesno: The Initial Design of a New Programming Language ( http://arxiv.org/abs/2303.15750v4 )

ライセンス: Link先を確認

Ozelot Vanilla, Jingxiang Yu, Hemn Barzan Abdalla,

(参考訳) プログラミング言語は非常に多彩で、開発者は個々の要件に合ったアプリケーションやプログラムを作成できます。この記事では、高度でユーザフレンドリで使いやすいプログラミング環境を提供するためにゼロから設計された、Cesnoという新しい言語を紹介します。 Cesnoの構文は他の人気のある言語と似ているため、学習と作業が簡単になる。構文シュガー、組み込みライブラリ、関数型プログラミングのサポート、オブジェクト指向プログラミング、動的型付け、型システム、さまざまな関数パラメータと制約など、他の言語の機能が含まれている。この記事では、Cesnoの文法の設計について検討し、Cesnoがどのようにコードを処理し、コンパイルするかを概観し、Cesnoのコードがどのようなもので、どのように開発に役立てるかを検証します。

Programming languages are incredibly versatile, enabling developers to create applications and programs that suit their individual requirements. This article introduces a new language called Cesno, designed from the ground up to offer an advanced, user-friendly, and easy-to-use programming environment. Cesno's syntax is similar to other popular languages, making it simple to learn and work with. It incorporates features from other languages, such as syntactic sugar, a built-in library, support for functional programming, object-oriented programming, dynamic typing, a type system, and a variety of function parameters and restrictions. This article will explore the design of Cesno's grammar, provide a brief overview of how Cesno processes and compiles code, and provide examples of what Cesno's code looks like and how it can aid in development.

公開日:2024-09-22
翻訳日:2024-11-09 15:24:36

# テンソルネットを用いた量子フーリエ変換のシミュレーション、グローバーのアルゴリズム、および限定絡み付き量子カウントアルゴリズム

Simulating the quantum Fourier transform, Grover's algorithm, and the quantum counting algorithm with limited entanglement using tensor-networks ( http://arxiv.org/abs/2304.01751v2 )

ライセンス: Link先を確認

Marcel Niedermeier, Jose L. Lado, Christian Flindt,

(参考訳) 量子アルゴリズムは、計算問題を大きなヒルベルト空間における量子進化として再構成する。ほとんどの量子アルゴリズムは、時間進化は完全にユニタリであり、完全なヒルベルト空間が利用できると仮定する。しかし実際には、利用可能な絡み合いは限られており、量子アルゴリズムの忠実度は低下する。量子回路の絡み合いを制限できるため、量子アルゴリズムの実行を限定的にシミュレートするため、テンソルネットワーク法は有用なフレームワークを提供する。そこで本研究では,量子フーリエ変換,グロバーのアルゴリズム,および量子カウントアルゴリズムのエンタングルメントが減少するにつれて,テンソルネットワークを用いて量子フーリエ変換の忠実度を解析し,各アルゴリズムの実行時に発生するエンタングルメントをマッピングする。いずれの場合も,絡み合いが幾分小さくても,アルゴリズムは高い忠実度で実行可能であることがわかった。この結果は将来の量子コンピュータ上でこれらのアルゴリズムを実行することを約束しており、テンソルネットワークに基づくシミュレーション手法は他の量子アルゴリズムにも適用することができる。

Quantum algorithms reformulate computational problems as quantum evolutions in a large Hilbert space. Most quantum algorithms assume that the time-evolution is perfectly unitary and that the full Hilbert space is available. However, in practice, the available entanglement may be limited, leading to a reduced fidelity of the quantum algorithms. To simulate the execution of quantum algorithms with limited entanglement, tensor-network methods provide a useful framework, since they allow us to restrict the entanglement in a quantum circuit. Thus, we here use tensor-networks to analyze the fidelity of the quantum Fourier transform, Grover's algorithm, and the quantum counting algorithm as the entanglement is reduced, and we map out the entanglement that is generated during the execution of each algorithm. In all three cases, we find that the algorithms can be executed with high fidelity even if the entanglement is somewhat reduced. Our results are promising for the execution of these algorithms on future quantum computers, and our simulation method based on tensor networks may also be applied to other quantum algorithms.

公開日:2024-09-25
翻訳日:2024-11-09 15:24:36

# 作用素空間におけるシュミット分解による量子絡み合いの解析

Analyzing quantum entanglement with the Schmidt decomposition in operator space ( http://arxiv.org/abs/2304.02447v2 )

ライセンス: Link先を確認

Chengjie Zhang, Sophia Denker, Ali Asadian, Otfried Gühne,

(参考訳) 絡み合いを特徴づけることは量子情報科学の中心である。絡み合いを示す特別な観察用具、いわゆる絡み合い証人は、この作業に広く使用される道具である。これらの証人の構成は典型的には、いくつかの絡み合ったターゲット状態に対する高い忠実度を持つ量子状態も絡み合っているという観察に依存している。可観測物のシュミット分解に基づいて絡み合う証人を構築するための一般的な方法を提案する。この方法は、多体システム(多体システム)と二体システム(多体システム)で機能し、忠実度に基づく構造よりも強力である。得られた証人は、絡み合いを定量化したり、その次元を特徴づけるためにも使うことができる。最後に,本手法が絡み込み検出を大幅に改善する実験例について述べる。

Characterizing entanglement is central for quantum information science. Special observables which indicate entanglement, so-called entanglement witnesses, are a widely used tool for this task. The construction of these witnesses typically relies on the observation that quantum states with a high fidelity to some entangled target state are entangled, too. We introduce a general method to construct entanglement witnesses based on the Schmidt decomposition of observables. The method works for two- and, more importantly, many-body systems and is strictly stronger than fidelity-based constructions. The resulting witnesses can also be used to quantify entanglement as well as to characterize the dimensionality of it. Finally, we present experimentally relevant examples, where our approach improves entanglement detection significantly.

公開日:2024-09-27
翻訳日:2024-11-09 15:13:22

# 神経集団動態と幾何学の解釈可能な統計的表現

Interpretable statistical representations of neural population dynamics and geometry ( http://arxiv.org/abs/2304.03376v4 )

ライセンス: Link先を確認

Adam Gosztolai, Robert L. Peach, Alexis Arnaudon, Mauricio Barahona, Pierre Vandergheynst,

(参考訳) ニューロンの集団のダイナミクスは、低次元多様体上で一般的に進化する。したがって、解釈可能かつ一貫した潜在表現を推論するために、ニューラル多様体上の動的過程を学ぶ方法が必要である。そこで我々は,manifold dynamics を局所流れ場に分解する表現学習法 MARBLE を導入し,教師なしの幾何学的深層学習を用いて,それらを共通潜時空間にマッピングする。シミュレーションされた非線形力学系, 繰り返しニューラルネットワーク, 霊長類および象牙類からの実験的な単一ニューロン記録において, 利得変調, 意思決定, 内部状態の変化の間に高次元神経力学をパラメトリーする創発的な低次元潜在表現が発見された。これらの表現はニューラルネットワークや動物間で一貫性があり、認知計算の堅牢な比較を可能にする。広範囲なベンチマークでは、MARBLEの最先端の内的および対人的デコード精度が、現在の表現学習アプローチと比較して、最小限のユーザ入力で示される。この結果から, 多様体構造は, 強力な復号アルゴリズムを開発し, 実験間でデータを同化するために, 強力な帰納バイアスを与えることが示唆された。

The dynamics of neuron populations commonly evolve on low-dimensional manifolds. Thus, we need methods that learn the dynamical processes over neural manifolds to infer interpretable and consistent latent representations. We introduce a representation learning method, MARBLE, that decomposes on-manifold dynamics into local flow fields and maps them into a common latent space using unsupervised geometric deep learning. In simulated non-linear dynamical systems, recurrent neural networks, and experimental single-neuron recordings from primates and rodents, we discover emergent low-dimensional latent representations that parametrise high-dimensional neural dynamics during gain modulation, decision-making, and changes in the internal state. These representations are consistent across neural networks and animals, enabling the robust comparison of cognitive computations. Extensive benchmarking demonstrates state-of-the-art within- and across-animal decoding accuracy of MARBLE compared with current representation learning approaches, with minimal user input. Our results suggest that manifold structure provides a powerful inductive bias to develop powerful decoding algorithms and assimilate data across experiments.

公開日:2024-09-24
翻訳日:2024-11-09 15:13:22

# CRISP:階層強化学習のための原始インフォームドサブゴール予測のカリキュラム化

CRISP: Curriculum Inducing Primitive Informed Subgoal Prediction for Hierarchical Reinforcement Learning ( http://arxiv.org/abs/2304.03535v5 )

ライセンス: Link先を確認

Utsav Singh, Vinay P. Namboodiri,

(参考訳) 階層的強化学習(HRL)は、時間的抽象を用いて複雑な長い地平線問題を解く有望な手法である。しかし、低レベルのプリミティブが非定常である場合、高レベルのポリシーを訓練することが難しいため、同時にポリシー階層を学習することは不安定である。本稿では、強化学習と模倣学習を用いて、低レベルのプリミティブを進化させるための達成可能なサブゴールのカリキュラムを効果的に生成する新しいHRLアルゴリズムであるCRISPを提案する。 CRISPは低レベルのプリミティブを使用して、少数の専門家によるデモンストレーションで定期的にデータレバーベリングを行い、新しいプリミティブインフォメーションパーシング(PIP)アプローチを使用して、非定常性を緩和する。私たちのアプローチでは、少数の専門家によるデモンストレーションにしかアクセスできないので、ほとんどのロボット制御タスクに適しています。複雑なロボット迷路ナビゲーションとロボット操作タスクの実験的評価は、階層的なカリキュラム学習の導入がサンプル効率を大幅に改善し、時間的に拡張されたタスクを解決するための効率的な目標条件付きポリシーをもたらすことを示した。さらに,複雑な操作タスクにおける実世界のロボット実験を行い,CRISPが実世界のシナリオにおける印象的な一般化を実証した。

Hierarchical reinforcement learning (HRL) is a promising approach that uses temporal abstraction to solve complex long horizon problems. However, simultaneously learning a hierarchy of policies is unstable as it is challenging to train higher-level policy when the lower-level primitive is non-stationary. In this paper, we present CRISP, a novel HRL algorithm that effectively generates a curriculum of achievable subgoals for evolving lower-level primitives using reinforcement learning and imitation learning. CRISP uses the lower level primitive to periodically perform data relabeling on a handful of expert demonstrations, using a novel primitive informed parsing (PIP) approach, thereby mitigating non-stationarity. Since our approach only assumes access to a handful of expert demonstrations, it is suitable for most robotic control tasks. Experimental evaluations on complex robotic maze navigation and robotic manipulation tasks demonstrate that inducing hierarchical curriculum learning significantly improves sample efficiency, and results in efficient goal conditioned policies for solving temporally extended tasks. Additionally, we perform real world robotic experiments on complex manipulation tasks and demonstrate that CRISP demonstrates impressive generalization in real world scenarios.

公開日:2024-09-24
翻訳日:2024-11-09 15:13:22

# Maybenot: トラフィック分析防御のためのフレームワーク

Maybenot: A Framework for Traffic Analysis Defenses ( http://arxiv.org/abs/2304.09510v2 )

ライセンス: Link先を確認

Tobias Pulls, Ethan Witwer,

(参考訳) エンドツーエンド暗号化は、インターネットユーザのプライバシーを保護する強力なツールである。 TorやVPN、暗号化メッセージングといった技術の利用の増加とともに、ネットワーク敵がインターネットトラフィックを監視して検閲することがますます難しくなってきている。敵に残された手段の一つがトラフィック分析、すなわち暗号化されたトラフィックのパターンを分析し、ユーザとその活動に関する情報を推測することである。ディープラーニングによる最近の改善により、トラフィック分析攻撃はこれまで以上に効果的になった。我々は、トラフィック分析防御のためのフレームワークであるMaybenotを提示する。 Maybenotは使いやすく、既存のエンドツーエンドの暗号化プロトコルに統合できるように設計されている。これはRustプログラミング言語でクレート(ライブラリ)として実装され、防御の開発をさらに進めるためのシミュレータとともに提供されている。 Maybenotの防御は、パディングを注入したり、送信トラフィックをブロックしたりするためのアクションをスケジュールする確率的状態マシンとして表現される。 Maybenotは、Perry氏とKadianakis氏によるTor Circuit Padding Frameworkからの進化であり、幅広いプロトコルとユースケースをサポートするように設計されている。

End-to-end encryption is a powerful tool for protecting the privacy of Internet users. Together with the increasing use of technologies such as Tor, VPNs, and encrypted messaging, it is becoming increasingly difficult for network adversaries to monitor and censor Internet traffic. One remaining avenue for adversaries is traffic analysis: the analysis of patterns in encrypted traffic to infer information about the users and their activities. Recent improvements using deep learning have made traffic analysis attacks more effective than ever before. We present Maybenot, a framework for traffic analysis defenses. Maybenot is designed to be easy to use and integrate into existing end-to-end encrypted protocols. It is implemented in the Rust programming language as a crate (library), together with a simulator to further the development of defenses. Defenses in Maybenot are expressed as probabilistic state machines that schedule actions to inject padding or block outgoing traffic. Maybenot is an evolution from the Tor Circuit Padding Framework by Perry and Kadianakis, designed to support a wide range of protocols and use cases.

公開日:2024-09-27
翻訳日:2024-11-09 15:13:22
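
The idea of a defense expressed as a probabilistic state machine that schedules padding or blocking actions can be sketched as follows. This is a hedged illustration in Python rather than Rust, and it does not use or reproduce the Maybenot crate's API: the state names, event names, probabilities, and the `MACHINE`, `step`, and `run` identifiers are all assumptions made for this example.

```python
import random

# Hypothetical machine: each state maps an event to weighted transitions
# (probability, next_state, action). An action is ("pad", delay_ms),
# ("block", duration_ms), or None when nothing is scheduled.
MACHINE = {
    "idle": {
        "packet_sent": [(0.7, "idle", None), (0.3, "padding", ("pad", 10))],
    },
    "padding": {
        "padding_sent": [(0.5, "padding", ("pad", 5)), (0.5, "idle", None)],
        "packet_sent": [(0.8, "idle", None), (0.2, "idle", ("block", 20))],
    },
}


def step(state, event, rng):
    """Sample one transition; events a state does not handle leave it unchanged."""
    choices = MACHINE[state].get(event)
    if not choices:
        return state, None
    r = rng.random()
    cumulative = 0.0
    for prob, next_state, action in choices:
        cumulative += prob
        if r < cumulative:
            return next_state, action
    # Guard against floating-point rounding: fall back to the last choice.
    _, next_state, action = choices[-1]
    return next_state, action


def run(events, seed=0):
    """Feed a sequence of events to the machine and collect scheduled actions."""
    rng = random.Random(seed)
    state = "idle"
    actions = []
    for event in events:
        state, action = step(state, event, rng)
        if action is not None:
            actions.append(action)
    return state, actions


final_state, scheduled = run(["packet_sent", "padding_sent", "packet_sent"] * 5, seed=1)
```

Seeding the generator makes a run reproducible, which is convenient when comparing defenses in a simulator; a real deployment would presumably draw from a non-deterministic source instead.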

# 個人データフローの可視化:Booking.comの事例から

Visualising Personal Data Flows: Insights from a Case Study of Booking.com ( http://arxiv.org/abs/2304.09603v5 )

ライセンス: Link先を確認

Haiyue Yuan, Matthew Boakes, Xiao Ma, Dongmei Cao, Shujun Li,

(参考訳) 商業組織は、絶え間なく増加する個人情報を保持し、処理している。ポリシーや法律は、これらの企業がデータの収集、保管、処理、共有に関してより透明性を持たなければならないように、継続的に変更されている。本稿では、プライバシポリシから抽出した個人データフローを可視化するケーススタディとして、Booking.comを取り上げている。消費者の個人情報の共有方法を示すことによって、私たちは質問を提起し、プライバシポリシを使用してオンラインユーザに対して、個人データフローの真の規模と状況について通知する際の課題と制限に関する議論を拡大します。このケーススタディは、よりデータフロー指向のプライバシポリシ分析に関する今後の研究や、複雑なビジネスエコシステムにおける個人データフローに関するより包括的なオントロジーの構築について教えてくれます。

Commercial organisations are holding and processing an ever-increasing amount of personal data. Policies and laws are continually changing to require these companies to be more transparent regarding the collection, storage, processing and sharing of this data. This paper reports our work of taking Booking.com as a case study to visualise personal data flows extracted from their privacy policy. By showcasing how the company shares its consumers' personal data, we raise questions and extend discussions on the challenges and limitations of using privacy policies to inform online users about the true scale and the landscape of personal data flows. This case study can inform us about future research on more data flow-oriented privacy policy analysis and on the construction of a more comprehensive ontology on personal data flows in complicated business ecosystems.

公開日:2024-09-20
翻訳日:2024-11-09 15:13:22

# CKBP v2: Commonsense Knowledge Base Populationのためのアノテーションと推論の改善

CKBP v2: Better Annotation and Reasoning for Commonsense Knowledge Base Population ( http://arxiv.org/abs/2304.10392v2 )

ライセンス: Link先を確認

Tianqing Fang, Quyet V. Do, Zihao Zheng, Weiqi Wang, Sehyun Choi, Zhaowei Wang, Yangqiu Song,

(参考訳) Commonsense Knowledge Bases (CSKB) Populationは、CSKBの知識を外部リソースで自動的に拡張することを目的としており、NLPにおいて重要なタスクである。 Fang et al (2021a) は CKBP v1 の評価セットを持つ CSKB Population (CKBP) フレームワークを提案した。しかし、CKBP v1は、かなりの数の誤った回答に苦しむクラウドソースアノテーションに依存しており、評価セットはランダムサンプリングによる外部知識ソースとの整合性に欠ける。本稿では,上記の2つの問題に,ドメインエキスパートをアノテータとして採用し,多種多様な反対サンプルを取り入れて,評価データをより代表的なものにすることで対処する,高品質なCSKB集団評価セットであるCKBP v2を紹介する。 CKBP v2 は CSKB Population タスクの挑戦的,代表的評価データセットとして機能し,その開発セットは,下流コモンセンス推論の知識獲得に寄与する集団モデルの選択を支援する。より良い人口モデルは、生成的コモンセンス推論とゼロショットコモンセンス質問応答の両方の監視信号として、より情報的なコモンセンス知識を得るのに役立つ。具体的には、DeBERTa-v3-large(He et al , 2023b)に基づく質問応答モデルは、ChatGPTやGPT-3.5など、ゼロショット設定で強力な大規模言語モデルよりも優れている。

Commonsense Knowledge Bases (CSKB) Population, which aims at automatically expanding knowledge in CSKBs with external resources, is an important yet hard task in NLP. Fang et al. (2021a) proposed a CSKB Population (CKBP) framework with an evaluation set CKBP v1. However, CKBP v1 relies on crowdsourced annotations that suffer from a considerable number of mislabeled answers, and the evaluation set lacks alignment with the external knowledge source due to random sampling. In this paper, we introduce CKBP v2, a new high-quality CSKB Population evaluation set that addresses the two aforementioned issues by employing domain experts as annotators and incorporating diversified adversarial samples to make the evaluation data more representative. We show that CKBP v2 serves as a challenging and representative evaluation dataset for the CSKB Population task, while its development set aids in selecting a population model that leads to improved knowledge acquisition for downstream commonsense reasoning. A better population model can also help acquire more informative commonsense knowledge as additional supervision signals for both generative commonsense inference and zero-shot commonsense question answering. Specifically, the question-answering model based on DeBERTa-v3-large (He et al., 2023b) even outperforms powerful large language models in a zero-shot setting, including ChatGPT and GPT-3.5.

公開日:2024-09-21
翻訳日:2024-11-09 15:13:22

# 非凸非平滑最適化問題に対する射影近位勾配:クルディカ・ロジャシエヴィチ(KL)特性のない高速収束

Projective Proximal Gradient Descent for A Class of Nonconvex Nonsmooth Optimization Problems: Fast Convergence Without Kurdyka-Lojasiewicz (KL) Property ( http://arxiv.org/abs/2304.10499v2 )

ライセンス: Link先を確認

Yingzhen Yang, Ping Li,

(参考訳) 非凸および非滑らかな最適化問題は統計学と機械学習にとって重要かつ困難な問題である。本稿では,非凸性と非平滑性が「非凸だが区分的に凸」な非平滑正則化項に由来する,非凸・非平滑な最適化問題のクラスを解くPPGD(Projected Proximal Gradient Descent)を提案する。クルディカ・ロジャシエヴィチ(Kurdyka-Łojasiewicz, KŁ)の性質に基づく非凸および非滑らか問題に対する加速PGD法の既存の収束解析とは対照的に、PPGDの局所的高速収束を示す新しい理論解析を提供する。 PPGDは、緩やかな仮定の下での非凸および非滑らかな問題のクラスにおいて、ある有限の $k_0$ に対して反復数が $k \ge k_0$ のとき $\mathcal{O}(1/k^2)$ の高速収束率を達成することが証明された。これは局所的に、リプシッツ連続な勾配を持つ滑らかな凸目的関数に対する一階法のNesterovの最適収束率に一致する。実験の結果, PPGDの有効性が示された。

Nonconvex and nonsmooth optimization problems are important and challenging for statistics and machine learning. In this paper, we propose Projected Proximal Gradient Descent (PPGD) which solves a class of nonconvex and nonsmooth optimization problems, where the nonconvexity and nonsmoothness come from a nonsmooth regularization term which is nonconvex but piecewise convex. In contrast with existing convergence analysis of accelerated PGD methods for nonconvex and nonsmooth problems based on the Kurdyka-Łojasiewicz (KŁ) property, we provide a new theoretical analysis showing local fast convergence of PPGD. It is proved that PPGD achieves a fast convergence rate of $\mathcal{O}(1/k^2)$ when the iteration number $k \ge k_0$ for a finite $k_0$ on a class of nonconvex and nonsmooth problems under mild assumptions, which is locally Nesterov's optimal convergence rate of first-order methods on smooth and convex objective function with Lipschitz continuous gradient. Experimental results demonstrate the effectiveness of PPGD.

公開日:2024-09-25
翻訳日:2024-11-09 15:13:22
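
For readers unfamiliar with the building block that PPGD extends, the sketch below shows plain (unaccelerated) proximal gradient descent on a convex toy problem. It is explicitly not PPGD: the regularizer here is the convex $\ell_1$ norm rather than a nonconvex piecewise-convex term, there is no projection or acceleration step, and the objective $\tfrac12\|x-b\|^2 + \lambda\|x\|_1$ and all function names are assumptions chosen because the minimizer is known in closed form.

```python
def soft_threshold(value, threshold):
    """Proximal operator of threshold * |x|: shrink `value` toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def proximal_gradient(b, lam, step=0.5, iters=100):
    """Minimize 0.5 * ||x - b||^2 + lam * ||x||_1 by proximal gradient descent."""
    x = [0.0] * len(b)
    for _ in range(iters):
        # Gradient of the smooth part 0.5 * ||x - b||^2 is (x - b).
        grad = [xi - bi for xi, bi in zip(x, b)]
        # Gradient step on the smooth part, then prox step on the l1 term.
        x = [soft_threshold(xi - step * gi, step * lam) for xi, gi in zip(x, grad)]
    return x


solution = proximal_gradient([3.0, 0.5, -2.0], lam=1.0)
closed_form = [soft_threshold(bi, 1.0) for bi in [3.0, 0.5, -2.0]]
```

For this objective the iterates converge to the soft-thresholded vector `closed_form`, which gives a simple correctness check; the nonconvex setting of the paper has no such closed form in general.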

# RoCOCO:MS-COCOのストレステスト画像テキストマッチングモデルに対するロバスト性ベンチマーク

RoCOCO: Robustness Benchmark of MS-COCO to Stress-test Image-Text Matching Models ( http://arxiv.org/abs/2304.10727v4 )

ライセンス: Link先を確認

Seulki Park, Daeho Um, Hajung Yoon, Sanghyuk Chun, Sangdoo Yun,

(参考訳) 様々な下流タスクで視覚言語モデルが広く使われているため、その堅牢性を評価することが重要である。本稿では,視覚言語モデルのロバスト性を評価するためのベンチマークを提案する。我々は、ロバストモデルが言語的意味論と視覚的意味論の両方を適切に理解し、明示的なバリエーションに耐性があることを信じている。この目的を追求するため、MS-COCOテストセットにテキストと画像の新しい変種を作成し、新しいデータを用いてSOTA(State-of-the-art)モデルを再評価する。具体的には、単語を置換してテキストの意味を変更し、画像ミキシング技術を用いて視覚的に変化した画像を生成する。提案したベンチマークでは、多くのSOTAモデル(例えば、画像からテキストへのリコール@1:81.9\% $\rightarrow$ 48.4\%、BLIP 66.1\% $\rightarrow$ 37.6\%、VSE$\infty$)において、大きなパフォーマンス劣化を示す。これは、現在の視覚言語モデルは微妙な変化に悩まされ、しばしばテキストや画像の全体的なコンテキストを理解するのに失敗していることを示している。これらの知見に基づき,より堅牢な埋め込み学習のために,意味的コントラスト損失と視覚的コントラスト損失を提案する。データセットとコードは {\url{https://github.com/pseulki/rococo}}で入手できる。

With the extensive use of vision-language models in various downstream tasks, evaluating their robustness is crucial. In this paper, we propose a benchmark for assessing the robustness of vision-language models. We believe that a robust model should properly understand both linguistic and visual semantics and be resilient to explicit variations. In pursuit of this goal, we create new variants of texts and images in the MS-COCO test set and re-evaluate the state-of-the-art (SOTA) models with the new data. Specifically, we alter the meaning of text by replacing a word, and generate visually altered images that maintain some visual context while introducing noticeable pixel changes through image mixing techniques. Our evaluations on the proposed benchmark reveal substantial performance degradation in many SOTA models (e.g., Image-to-Text Recall@1: 81.9\% $\rightarrow$ 48.4\% in BLIP, 66.1\% $\rightarrow$ 37.6\% in VSE$\infty$), with the models often favoring the altered texts/images over the original ones. This indicates the current vision-language models struggle with subtle changes and often fail to understand the overall context of texts and images. Based on these findings, we propose semantic contrastive loss and visual contrastive loss to learn more robust embedding. Datasets and code are available at https://github.com/pseulki/rococo.

公開日:2024-09-27
翻訳日:2024-11-09 15:13:22

# サービス拒否とファイングラインド制御--フレキシブルモデルによるフェデレート学習への攻撃に向けて

Denial-of-Service or Fine-Grained Control: Towards Flexible Model Poisoning Attacks on Federated Learning ( http://arxiv.org/abs/2304.10783v3 )

ライセンス: Link先を確認

Hangtao Zhang, Zeming Yao, Leo Yu Zhang, Shengshan Hu, Chao Chen, Alan Liew, Zhetao Li,

(参考訳) フェデレーテッド・ラーニング(FL)は、敵がグローバルアグリゲーションの結果を腐敗させ、DoS(DoS)を否定する有害な攻撃に対して脆弱である。特定方向の悪意的摂動の振幅を最適化してDoSを発生させる最近のモデル中毒攻撃とは違って,汎用的な攻撃目標を達成するフレキシブルモデル中毒攻撃(FMPA)を提案する。 FLシステムに関する余分な知識(例えば、アグリゲーションルールやベニグナブルデバイスのアップデートなど)を敵に提供できない現実的な脅威シナリオを考える。 FMPAは、グローバルな歴史的情報を利用して、グローバルモデルの次のラウンドを良心的な参照として予測する推定器を構築する。その後、基準モデルを微調整し、低い精度と小さな摂動で所望の有毒モデルを得る。 DoSを発生させる目的の他に、FMPAを自然に拡張して細かい制御可能な攻撃を発射することで、グローバルな精度を正確に低減することができる。厳格なコントロールで武装した悪意のあるFLサービスプロバイダは、注意を払わずに競合相手に対してアドバンテージを得られるため、DoS以外のFLに新たな攻撃サーフェスを開くことができる。 DoSの目的においても、FMPAは世界の精度を著しく低下させ、最先端の6つの攻撃を上回ります。

Federated learning (FL) is vulnerable to poisoning attacks, where adversaries corrupt the global aggregation results and cause denial-of-service (DoS). Unlike recent model poisoning attacks that optimize the amplitude of malicious perturbations along certain prescribed directions to cause DoS, we propose a Flexible Model Poisoning Attack (FMPA) that can achieve versatile attack goals. We consider a practical threat scenario where no extra knowledge about the FL system (e.g., aggregation rules or updates on benign devices) is available to adversaries. FMPA exploits the global historical information to construct an estimator that predicts the next round of the global model as a benign reference. It then fine-tunes the reference model to obtain the desired poisoned model with low accuracy and small perturbations. Besides the goal of causing DoS, FMPA can be naturally extended to launch a fine-grained controllable attack, making it possible to precisely reduce the global accuracy. Armed with precise control, malicious FL service providers can gain advantages over their competitors without getting noticed, hence opening a new attack surface in FL other than DoS. Even for the purpose of DoS, experiments show that FMPA significantly decreases the global accuracy, outperforming six state-of-the-art attacks.

公開日:2024-09-26
翻訳日:2024-11-09 15:13:22

# 確率的エージェントドロップアウト下におけるマルチエージェントMDPのモデル自由学習と最適ポリシー設計

Model-Free Learning and Optimal Policy Design in Multi-Agent MDPs Under Probabilistic Agent Dropout ( http://arxiv.org/abs/2304.12458v2 )

ライセンス: Link先を確認

Carmel Fiscko, Soummya Kar, Bruno Sinopoli,

(参考訳) 本研究では,エージェントドロップアウトを行うマルチエージェントマルコフ決定プロセス(MDP)と,事前ドロップアウトシステムの制御とサンプリングに基づくポストドロップアウトシステムのポリシーの計算について検討する。中央プランナーの目的は、エージェントのドロップアウト確率の事前知識が与えられた場合、期待されるシステムの価値を最大化する最適なポリシーを見つけることである。特定の遷移独立性と報酬分離性構造を持つMDPに対して、システムからエージェントを取り除くことは、新しい状態と行動空間を持つ残りのエージェントと、除去されたエージェントを疎外する遷移ダイナミクスと、除去されたエージェントとは独立な報酬からなる新しいMDPを形成すると仮定する。この「ロバストMDP」は、Nがエージェント数を表すようなシステムの2ドルN$実現度を全て評価する必要性を排除している。さらに、モデルフリーの文脈では、ロバストなMDP値を事前ドロップアウトシステムによって生成されたサンプルで推定できることが示され、つまり、ドロップアウトが起こる前にロバストなポリシーを見つけることができる。この事実は、ドロップアウトシナリオに対するポリシー評価を行うための政策重要サンプリング(IS)ルーチンの提案に利用され、既存のシステムを適切な事前ドロップアウトポリシーで制御する。ポリシーISルーチンは、堅牢なMDPと特定のドロップアウトシステムの実現の両方に対して値推定を生成し、指数的信頼境界で正当化される。最後に、このアプローチの有用性をシミュレーションで検証し、エージェントのドロップアウトの構造的特性が、ドロップアウトが起こる前にコントローラが優れたドロップアウトポリシーを見つけるのにどう役立つかを示す。

This work studies a multi-agent Markov decision process (MDP) that can undergo agent dropout and the computation of policies for the post-dropout system based on control and sampling of the pre-dropout system. The central planner's objective is to find an optimal policy that maximizes the value of the expected system given a priori knowledge of the agents' dropout probabilities. For MDPs with a certain transition independence and reward separability structure, we assume that removing agents from the system forms a new MDP comprised of the remaining agents with new state and action spaces, transition dynamics that marginalize the removed agents, and rewards that are independent of the removed agents. We first show that under these assumptions, the value of the expected post-dropout system can be represented by a single MDP; this "robust MDP" eliminates the need to evaluate all $2^N$ realizations of the system, where $N$ denotes the number of agents. More significantly, in a model-free context, it is shown that the robust MDP value can be estimated with samples generated by the pre-dropout system, meaning that robust policies can be found before dropout occurs. This fact is used to propose a policy importance sampling (IS) routine that performs policy evaluation for dropout scenarios while controlling the existing system with good pre-dropout policies. The policy IS routine produces value estimates for both the robust MDP and specific post-dropout system realizations and is justified with exponential confidence bounds. Finally, the utility of this approach is verified in simulation, showing how structural properties of agent dropout can help a controller find good post-dropout policies before dropout occurs.

公開日:2024-09-22
翻訳日:2024-11-09 15:13:22
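
The remark about avoiding all $2^N$ dropout realizations can be illustrated with a static toy calculation. The sketch below is not the paper's robust MDP: it has no states, actions, or transitions, and it assumes (hypothetically) that each agent drops out independently and that the system value is simply the sum of fixed per-agent rewards over the agents that remain. Under that separability assumption the expectation over all realizations collapses to a single weighted sum.

```python
from itertools import product


def expected_value_enumerated(dropout_probs, rewards):
    """Average the system value over all 2^N stay/drop realizations."""
    total = 0.0
    for stays in product([True, False], repeat=len(dropout_probs)):
        prob = 1.0
        value = 0.0
        for stay, p, r in zip(stays, dropout_probs, rewards):
            prob *= (1.0 - p) if stay else p
            if stay:
                value += r  # only remaining agents contribute reward
        total += prob * value
    return total


def expected_value_separable(dropout_probs, rewards):
    """Same expectation in O(N): weight each reward by its survival probability."""
    return sum((1.0 - p) * r for p, r in zip(dropout_probs, rewards))


probs = [0.1, 0.5, 0.9]
rewards = [1.0, 2.0, 3.0]
by_enumeration = expected_value_enumerated(probs, rewards)
by_separability = expected_value_separable(probs, rewards)
```

Both functions return the same number; the first loops over 8 realizations for three agents, while the second needs three multiplications, which is the flavor of saving the abstract attributes to the single-MDP representation.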

# 1ビット行列補完のための正規化最小化ガウスニュートン法

A Majorization-Minimization Gauss-Newton Method for 1-Bit Matrix Completion ( http://arxiv.org/abs/2304.13940v3 )

ライセンス: Link先を確認

Xiaoqian Liu, Xu Han, Eric C. Chi, Boaz Nadler,

(参考訳) 1ビット行列の完備化では、基礎となる低ランク行列をバイナリー観測の部分集合から推定することを目的としている。本稿では,Majorization-Minimization Gauss-Newton (MMGN) と呼ばれる新しい1ビット行列補完法を提案する。本手法は,元の最適化問題を標準的な低ランク行列補完問題に変換する偏極最小化原理に基づく。これらのサブプロブレムのそれぞれを、仮定された低ランク構造を明示的に強制する分解法により解き、その後、ガウス・ニュートン法を適用する。シミュレーションと実データ例を用いて、既存の1ビット行列補完法と比較して、MMGNはより正確な推定値でない場合に匹敵する出力を出力する。加えて、これはしばしば著しく速く、下層のマトリックスのスパイキネスに敏感でない。元の目的を直接最小化する3つの標準的な汎用最適化手法と比較して、MMGNは特に観測された成分のごく一部が小さい場合に、明確な計算上の優位性を示す。

In 1-bit matrix completion, the aim is to estimate an underlying low-rank matrix from a partial set of binary observations. We propose a novel method for 1-bit matrix completion called Majorization-Minimization Gauss-Newton (MMGN). Our method is based on the majorization-minimization principle, which converts the original optimization problem into a sequence of standard low-rank matrix completion problems. We solve each of these sub-problems by a factorization approach that explicitly enforces the assumed low-rank structure and then apply a Gauss-Newton method. Using simulations and a real data example, we illustrate that in comparison to existing 1-bit matrix completion methods, MMGN outputs comparable if not more accurate estimates. In addition, it is often significantly faster, and less sensitive to the spikiness of the underlying matrix. In comparison with three standard generic optimization approaches that directly minimize the original objective, MMGN also exhibits a clear computational advantage, especially when the fraction of observed entries is small.

公開日:2024-09-23
翻訳日:2024-11-09 15:13:22
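
The problem setting, a low-rank matrix seen only through a partial set of binary observations, can be made concrete with a small data-generation sketch. The abstract does not state the link function, so the logistic link used here is an assumption (a common choice in this literature), as are the function names and sizes. The sketch covers only the observation model and its negative log-likelihood, not the MMGN algorithm itself.

```python
import math
import random


def low_rank_matrix(n_rows, n_cols, rank, rng):
    """Build M = U V^T from Gaussian factors, so M has rank at most `rank`."""
    U = [[rng.gauss(0.0, 1.0) for _ in range(rank)] for _ in range(n_rows)]
    V = [[rng.gauss(0.0, 1.0) for _ in range(rank)] for _ in range(n_cols)]
    return [
        [sum(U[i][k] * V[j][k] for k in range(rank)) for j in range(n_cols)]
        for i in range(n_rows)
    ]


def observe_one_bit(M, fraction, rng):
    """Reveal each entry with probability `fraction` as +1/-1 via a logistic link."""
    observations = {}
    for i, row in enumerate(M):
        for j, m in enumerate(row):
            if rng.random() < fraction:
                p_plus = 1.0 / (1.0 + math.exp(-m))  # assumed link: P(Y = +1)
                observations[(i, j)] = 1 if rng.random() < p_plus else -1
    return observations


def negative_log_likelihood(M, observations):
    """Sum of log(1 + exp(-y * m)) over observed entries (logistic model)."""
    return sum(
        math.log1p(math.exp(-y * M[i][j])) for (i, j), y in observations.items()
    )


rng = random.Random(0)
M = low_rank_matrix(6, 5, rank=2, rng=rng)
obs = observe_one_bit(M, fraction=0.5, rng=rng)
nll = negative_log_likelihood(M, obs)
```

An estimator would minimize `negative_log_likelihood` over low-rank candidates; lowering `fraction` reproduces the sparse-observation regime in which the abstract reports the clearest computational advantage.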

# 移動エゴ車からのイベントフリー移動物体セグメンテーション

Event-Free Moving Object Segmentation from Moving Ego Vehicle ( http://arxiv.org/abs/2305.00126v3 )

ライセンス: Link先を確認

Zhuyun Zhou, Zongwei Wu, Danda Pani Paudel, Rémi Boutteau, Fan Yang, Luc Van Gool, Radu Timofte, Dominique Ginhac,

(参考訳) 動的シーンにおける移動物体セグメンテーション(MOS)は、特に移動するエゴ車から得られるシーケンスについて、重要な、難しい、しかし未調査の研究テーマである。ほとんどのセグメンテーション法は、光学フローマップから得られるモーションキューを利用する。しかし、これらの手法は連続するRGBフレームから事前計算される光学的流れに基づいていることが多いため、フレーム内で発生した事象の時間的考慮を無視して、相対的な静的性を示すが実際に動いている物体を識別する能力を制限する。これらの制約に対処するために,光学的フローに頼ることなくリッチなモーションキューを提供する,より優れた映像理解のためのイベントカメラの利用を提案する。この分野での研究を促進するために、我々はまずDSEC-MOSと呼ばれる新しい大規模データセットを導入し、移動中のエゴ車から物体のセグメンテーションを移動させる。ベンチマークでは、さまざまな主流メソッドを選択し、データセット上でそれらを厳格に評価する。その後、イベントデータを活用可能な新しいネットワークであるEmoFormerを考案した。この目的のために、時間的前兆を空間意味マップと融合させ、実際に動く物体を静的な背景から区別し、興味のある物体の周囲に別のレベルの集中的な監督を加える。提案するネットワークは,トレーニングにイベントデータのみに依存するが,推論時にイベント入力を必要としないため,効率の面でフレームのみの手法と直接的に比較でき,多くのアプリケーションでより広く利用することができる。徹底的な比較は、他のすべての方法と比較して、我々の手法の大幅な性能向上を浮き彫りにしている。ソースコードとデータセットは、https://github.com/ZZY-Zhou/DSEC-MOSで公開されている。

Moving object segmentation (MOS) in dynamic scenes is an important, challenging, but under-explored research topic for autonomous driving, especially for sequences obtained from moving ego vehicles. Most segmentation methods leverage motion cues obtained from optical flow maps. However, because these methods often rely on optical flow pre-computed from successive RGB frames, they neglect events occurring between frames, which limits their ability to discern objects that appear relatively static but are in fact moving. To address these limitations, we propose to exploit event cameras for better video understanding, which provide rich motion cues without relying on optical flow. To foster research in this area, we first introduce a novel large-scale dataset called DSEC-MOS for moving object segmentation from moving ego vehicles, which is the first of its kind. For benchmarking, we select various mainstream methods and rigorously evaluate them on our dataset. Subsequently, we devise EmoFormer, a novel network able to exploit the event data. For this purpose, we fuse the event temporal prior with spatial semantic maps to distinguish genuinely moving objects from the static background, adding another level of dense supervision around our object of interest. Our proposed network relies only on event data for training but does not require event input during inference, making it directly comparable to frame-only methods in terms of efficiency and more widely usable in many application cases. The exhaustive comparison highlights a significant performance improvement of our method over all other methods. The source code and dataset are publicly available at: https://github.com/ZZY-Zhou/DSEC-MOS.

公開日:2024-09-25
翻訳日:2024-11-09 15:13:22

# 人工知能によるアグリフードシステムの構築 : 進歩・課題・機会に関する調査

Empowering Agrifood System with Artificial Intelligence: A Survey of the Progress, Challenges and Opportunities ( http://arxiv.org/abs/2305.01899v2 )

ライセンス: Link先を確認

Tao Chen, Liang Lv, Di Wang, Jing Zhang, Yue Yang, Zeyang Zhao, Chen Wang, Xiaowei Guo, Hao Chen, Qingye Wang, Yufei Xu, Qiming Zhang, Bo Du, Liangpei Zhang, Dacheng Tao,

(参考訳) 世界人口が急増するにつれて、アグリフードのシステムはより生産的、効率的、安全、持続的へと変化し、潜在的な食糧不足を緩和するためには不可欠である。近年、ディープラーニング(DL)のような人工知能(AI)技術は、言語、視覚、リモートセンシング(RS)、アグリフードシステムアプリケーションなど、様々な分野でその強みを実証している。しかし、アグリフードシステムに対するAIの全体的な影響は、まだ不明である。本稿では,AI技術がアグリフードシステムをどのように変革し,現代のアグリフード産業に貢献するかを,徹底的にレビューする。まず,取得・保存・処理技術を含む,アグリフードシステムにおけるデータ取得手法について概説する。第2に,農業,畜産,漁業などのアグリフードシステムにおけるAI手法の進歩を概観し,アグリフード分類,成長モニタリング,収量予測,品質評価などのトピックについて紹介する。さらに、AIで現代のアグリフードシステムを変革するための潜在的な課題と有望な研究機会を強調します。この調査が、この分野の新参者に全体像を提供し、さらなる研究の出発点になることを期待している。プロジェクトのWebサイトはhttps://github.com/Frenkie14/Agrifood-Surveyである。

With the world population rapidly increasing, transforming our agrifood systems to be more productive, efficient, safe, and sustainable is crucial to mitigate potential food shortages. Recently, artificial intelligence (AI) techniques such as deep learning (DL) have demonstrated strong capabilities in various areas, including language, vision, remote sensing (RS), and agrifood systems applications. However, the overall impact of AI on agrifood systems remains unclear. In this paper, we thoroughly review how AI techniques can transform agrifood systems and contribute to the modern agrifood industry. First, we summarize the data acquisition methods in agrifood systems, including acquisition, storage, and processing techniques. Second, we present a progress review of AI methods in agrifood systems, specifically in agriculture, animal husbandry, and fishery, covering topics such as agrifood classification, growth monitoring, yield prediction, and quality assessment. Furthermore, we highlight potential challenges and promising research opportunities for transforming modern agrifood systems with AI. We hope this survey can offer an overall picture to newcomers in the field and serve as a starting point for their further research. The project website is https://github.com/Frenkie14/Agrifood-Survey.

公開日:2024-09-26
翻訳日:2024-11-09 15:13:22

# 実世界3次元シュミレーションを伴わない手の物体の3次元再構成

3D Reconstruction of Objects in Hands without Real World 3D Supervision ( http://arxiv.org/abs/2305.03036v2 )

ライセンス: Link先を確認

Aditya Prakash, Matthew Chang, Matthew Jin, Ruisen Tu, Saurabh Gupta,

(参考訳) 以前は、手持ちの物体を1枚のイメージトレインモデルから3次元形状と組み合わせた画像に再構成する作業を行っていた。このようなデータは、現実の世界で大規模に収集することは困難である。したがって、これらの手法は、新しいオブジェクトをウィジェット内で提示する際には、うまく一般化しない。 3Dの監督は大きなボトルネックだが、多岐にわたる。 a)手動物体の相互作用と映像データ b) 合成3次元形状コレクション本稿では,これらのソースから3Dインスペクションを活用するモジュールを提案し,ハンドヘルドオブジェクトの再構築のためのモデル学習をスケールアップする。具体的には、ビデオから多視点2Dマスクの監視を抽出し、形状収集から3次元形状の前兆を抽出する。我々はこれらの間接的な3次元キューを用いて、単一のRGB画像から物体の3次元形状を予測する占有ネットワークを訓練する。既存のデータセットを3Dで教師するモデルよりも11.6%の相対的な改善が見られた。

Prior work on reconstructing hand-held objects from a single image trains models on images paired with 3D shapes. Such data is challenging to gather in the real world at scale. Consequently, these approaches do not generalize well when presented with novel objects in in-the-wild settings. While 3D supervision is a major bottleneck, there is an abundance of a) in-the-wild raw video data showing hand-object interactions and b) synthetic 3D shape collections. In this paper, we propose modules to leverage 3D supervision from these sources to scale up the learning of models for reconstructing hand-held objects. Specifically, we extract multiview 2D mask supervision from videos and 3D shape priors from shape collections. We use these indirect 3D cues to train occupancy networks that predict the 3D shape of objects from a single RGB image. Our experiments in the challenging object generalization setting on the in-the-wild MOW dataset show an 11.6% relative improvement over models trained with 3D supervision on existing datasets.

公開日:2024-09-23
翻訳日:2024-11-09 15:13:22

# 大規模言語モデルのための高速分散推論

Fast Distributed Inference Serving for Large Language Models ( http://arxiv.org/abs/2305.05920v2 )

ライセンス: Link先を確認

Bingyang Wu, Yinmin Zhong, Zili Zhang, Gang Huang, Xuanzhe Liu, Xin Jin,

(参考訳) 大規模言語モデル(LLM)は、ChatGPTで実証された対話型AIアプリケーションの新しい世代のパワーである。これらのアプリケーションのインタラクティブな性質は、LLM推論に低レイテンシを必要とする。既存のLLMサービスシステムは、ライン・オブ・ラインのブロッキングと長時間の待ち時間に悩まされる推論ジョブに対して、実行から補完処理を使用する。 LLMのための分散推論サービスシステムであるFastServeについて述べる。 FastServeはLLM推論の自己回帰パターンを利用して、各出力トークンの粒度のプリエンプションを可能にする。 FastServeはプリエンプティブスケジューリングを使用して、新しいスキップジョイントマルチレベルフィードバックキュースケジューラでレイテンシを最小限にする。 LLM推論の新たな半情報非依存設定に基づいて、スケジューラは入力長情報を利用して、到着する各ジョブに適切な初期キューを割り当てる。結合キューよりも優先度の高いキューは、削除を減らすためにスキップされる。我々は、LLM推論のためのGPUメモリとホストメモリの中間状態を積極的にオフロードし、アップロードする効率的なGPUメモリ管理機構を設計する。我々は,FastServeのシステムプロトタイプを構築し,最先端のソリューションであるvLLMと比較して,同じ平均および末尾遅延条件下でのスループットを最大31.4xと17.9xに改善したことを示す。

Large language models (LLMs) power a new generation of interactive AI applications exemplified by ChatGPT. The interactive nature of these applications demands low latency for LLM inference. Existing LLM serving systems use run-to-completion processing for inference jobs, which suffers from head-of-line blocking and long latency. We present FastServe, a distributed inference serving system for LLMs. FastServe exploits the autoregressive pattern of LLM inference to enable preemption at the granularity of each output token. FastServe uses preemptive scheduling to minimize latency with a novel skip-join Multi-Level Feedback Queue scheduler. Based on the new semi-information-agnostic setting of LLM inference, the scheduler leverages the input length information to assign an appropriate initial queue for each arriving job to join. Queues with higher priority than the joined queue are skipped to reduce demotions. We design an efficient GPU memory management mechanism that proactively offloads and uploads intermediate state between GPU memory and host memory for LLM inference. We build a system prototype of FastServe, and experimental results show that, compared to the state-of-the-art solution vLLM, FastServe improves the throughput by up to 31.4x and 17.9x under the same average and tail latency requirements, respectively.
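The skip-join idea in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not FastServe's implementation: the quanta in `QUANTA`, the cost model (the first iteration costs `input_len` time units, every later token costs 1), and the names `Job`, `initial_queue`, and `run` are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass

# Hypothetical time quantum for each priority level (level 0 = highest priority).
QUANTA = [4, 16, 64, 256]


@dataclass
class Job:
    name: str
    input_len: int   # prompt length, known when the job arrives
    output_len: int  # tokens to generate; used here only to drive the simulation
    generated: int = 0
    used: int = 0    # time consumed at the current priority level


def initial_queue(input_len: int) -> int:
    """Skip-join: start at the first level whose quantum covers the prompt cost."""
    for level, quantum in enumerate(QUANTA):
        if input_len <= quantum:
            return level
    return len(QUANTA) - 1


def run(jobs):
    """Run jobs one output token at a time and return names in completion order."""
    queues = [deque() for _ in QUANTA]
    for job in jobs:
        queues[initial_queue(job.input_len)].append(job)

    finished = []
    while any(queues):
        level = next(i for i, q in enumerate(queues) if q)
        job = queues[level].popleft()
        # Assumed cost model: the first iteration also processes the prompt.
        cost = job.input_len if job.generated == 0 else 1
        job.generated += 1
        job.used += cost
        if job.generated == job.output_len:
            finished.append(job.name)
        elif job.used >= QUANTA[level] and level + 1 < len(QUANTA):
            job.used = 0
            queues[level + 1].append(job)   # quantum exhausted: demote
        else:
            queues[level].appendleft(job)   # keep running at this level
    return finished
```

With `run([Job("long", 50, 3), Job("short", 3, 2)])`, the long-prompt job joins level 2 directly instead of being demoted twice, and the short job finishes first even though it was submitted second.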

公開日:2024-09-25
翻訳日:2024-11-09 15:13:22

# 大規模言語モデルのための高速分散推論

Fast Distributed Inference Serving for Large Language Models ( http://arxiv.org/abs/2305.05920v3 )

ライセンス: Link先を確認

Bingyang Wu, Yinmin Zhong, Zili Zhang, Shengyu Liu, Fangyue Liu, Yuanhang Sun, Gang Huang, Xuanzhe Liu, Xin Jin,

公開日:2024-09-25
翻訳日:2024-11-09 15:13:22

# CADGE: グラフ構造化知識集約による文脈認識対話生成

CADGE: Context-Aware Dialogue Generation Enhanced with Graph-Structured Knowledge Aggregation ( http://arxiv.org/abs/2305.06294v4 )

ライセンス: Link先を確認

Hongbo Zhang, Chen Tang, Tyler Loakman, Bohao Yang, Stefan Goetze, Chenghua Lin,

(参考訳) 常識知識は多くの自然言語処理タスクに不可欠である。既存の研究は通常、グラフ知識を従来のグラフニューラルネットワーク(GNN)に組み込む。しかし、この区画化は、これらの2種類の入力知識間の文脈的相互作用を完全に活用するわけではない。本稿では,文脈対応グラフアテンションモデル (Context-aware graph-attention model) を提案する。具体的には、フラットなグラフ知識とテキストデータとを融合させることにより、不均一な特徴を調和させる表現学習に革新的なアプローチを採用する。コンテクスト情報によって補完される連結部分グラフにおけるグラフ知識集約の階層的適用により、コモンセンス駆動対話の生成を促進する。実験により,本フレームワークは従来のGNNベース言語モデルよりも性能が優れていることが示された。自動評価と人的評価の両面から,提案モデルのフローベースラインに対する性能向上が確認できた。

Commonsense knowledge is crucial to many natural language processing tasks. Existing works usually incorporate graph knowledge with conventional graph neural networks (GNNs), resulting in a sequential pipeline that compartmentalizes the encoding processes for textual and graph-based knowledge. However, this compartmentalization does not fully exploit the contextual interplay between these two types of input knowledge. In this paper, a novel context-aware graph-attention model (Context-aware GAT) is proposed, designed to effectively assimilate global features from relevant knowledge graphs through a context-enhanced knowledge aggregation mechanism. Specifically, the proposed framework employs an innovative approach to representation learning that harmonizes heterogeneous features by amalgamating flattened graph knowledge with text data. We analyze how the hierarchical application of graph knowledge aggregation within connected subgraphs, complemented by contextual information, bolsters the generation of commonsense-driven dialogues. Empirical results demonstrate that our framework outperforms conventional GNN-based language models. Both automated and human evaluations affirm the significant performance enhancements achieved by our proposed model over the concept flow baseline.

公開日:2024-09-22
翻訳日:2024-11-09 15:13:22

# 脳腫瘍セグメンテーション(BraTS)課題 : 塗布による健康な脳組織の局所的合成

The Brain Tumor Segmentation (BraTS) Challenge: Local Synthesis of Healthy Brain Tissue via Inpainting ( http://arxiv.org/abs/2305.08992v3 )

ライセンス: Link先を確認

Florian Kofler, Felix Meissen, Felix Steinbauer, Robert Graf, Stefan K Ehrlich, Annika Reinke, Eva Oswald, Diana Waldmannstetter, Florian Hoelzl, Izabela Horvath, Oezguen Turgut, Suprosanna Shit, Christina Bukas, Kaiyuan Yang, Johannes C. Paetzold, Ezequiel de da Rosa, Isra Mekki, Shankeeth Vinayahalingam, Hasan Kassem, Juexin Zhang, Ke Chen, Ying Weng, Alicia Durrer, Philippe C. Cattin, Julia Wolleb, M. S. Sadique, M. M. Rahman, W. Farzana, A. Temtam, K. M. Iftekharuddin, Maruf Adewole, Syed Muhammad Anwar, Ujjwal Baid, Anastasia Janas, Anahita Fathi Kazerooni, Dominic LaBella, Hongwei Bran Li, Ahmed W Moawad, Gian-Marco Conte, Keyvan Farahani, James Eddy, Micah Sheller, Sarthak Pati, Alexandros Karagyris, Alejandro Aristizabal, Timothy Bergquist, Verena Chung, Russell Takeshi Shinohara, Farouk Dako, Walter Wiggins, Zachary Reitman, Chunhao Wang, Xinyang Liu, Zhifan Jiang, Elaine Johanson, Zeke Meier, Ariana Familiar, Christos Davatzikos, John Freymann, Justin Kirby, Michel Bilello, Hassan M Fathallah-Shaykh, Roland Wiest, Jan Kirschke, Rivka R Colen, Aikaterini Kotrotsou, Pamela Lamontagne, Daniel Marcus, Mikhail Milchenko, Arash Nazeri, Marc-André Weber, Abhishek Mahajan, Suyash Mohan, John Mongan, Christopher Hess, Soonmee Cha, Javier Villanueva-Meyer, Errol Colak, Priscila Crivellaro, Andras Jakab, Abiodun Fatade, Olubukola Omidiji, Rachel Akinola Lagos, O O Olatunji, Goldey Khanna, John Kirkpatrick, Michelle Alonso-Basanta, Arif Rashid, Miriam Bornhorst, Ali Nabavizadeh, Natasha Lepore, Joshua Palmer, Antonio Porras, Jake Albrecht, Udunna Anazodo, Mariam Aboian, Evan Calabrese, Jeffrey David Rudie, Marius George Linguraru, Juan Eugenio Iglesias, Koen Van Leemput, Spyridon Bakas, Benedikt Wiestler, Ivan Ezhov, Marie Piraud, Bjoern H Menze,

(参考訳) 脳MR画像の自動解析のための無数のアルゴリズムが、臨床医の意思決定を支援するために利用可能である。脳腫瘍患者の場合、画像取得の時系列は通常、すでに病理的なスキャンから始まる。多くのアルゴリズムは健康な脳を解析し、病変を特徴とする画像の保証を提供しない。例えば、脳解剖学のパーセレーション、組織セグメンテーション、脳抽出のアルゴリズムがある。このジレンマを解決するために,BraTS塗装の課題を紹介する。そこで参加者は、損傷した脳から健康な脳スキャンを合成するための塗装技術を探る。下記の原稿にはタスクの定式化、データセット、提出手順が含まれている。その後、課題の調査結果をまとめるために更新される。この挑戦はASNR-BraTS MICCAIチャレンジの一部として組織されている。

A myriad of algorithms for the automatic analysis of brain MR images is available to support clinicians in their decision-making. For brain tumor patients, the image acquisition time series typically starts with an already pathological scan. This poses problems, as many algorithms are designed to analyze healthy brains and provide no guarantee for images featuring lesions. Examples include, but are not limited to, algorithms for brain anatomy parcellation, tissue segmentation, and brain extraction. To solve this dilemma, we introduce the BraTS inpainting challenge. Here, the participants explore inpainting techniques to synthesize healthy brain scans from lesioned ones. The following manuscript contains the task formulation, dataset, and submission procedure. Later, it will be updated to summarize the findings of the challenge. The challenge is organized as part of the ASNR-BraTS MICCAI challenge.

公開日:2024-09-22
翻訳日:2024-11-09 15:13:22

# 教師なし要約の最近の動向

Recent Trends in Unsupervised Summarization ( http://arxiv.org/abs/2305.11231v2 )

ライセンス: Link先を確認

Mohammad Khosravani, Amine Trabelsi,

(参考訳) 教師なしの要約は、ラベル付きデータセットを必要とせずにモデルを要約する訓練を可能にする強力なテクニックである。このサーベイは、教師なし要約に使用される様々な手法とモデルをカバーしている。我々は、教師なし要約を実現するために用いられる抽出的、抽象的、ハイブリッドなモデルと戦略を網羅する。この調査の主な焦点は最近の研究であるが、過去の重要な研究についても紹介する。さらに分類学を導入し、教師なしトレーニングへのアプローチに基づいて異なる研究を分類する。最後に、現在のアプローチについて議論し、いくつかのデータセットと評価手法について述べる。

Unsupervised summarization is a powerful technique that enables training summarizing models without requiring labeled datasets. This survey covers different recent techniques and models used for unsupervised summarization. We cover extractive, abstractive, and hybrid models and strategies used to achieve unsupervised summarization. While the main focus of this survey is on recent research, we also cover some of the important previous research. We additionally introduce a taxonomy, classifying different research based on their approach to unsupervised training. Finally, we discuss the current approaches and mention some datasets and evaluation methods.

公開日:2024-09-26
翻訳日:2024-11-09 15:13:22

# 言語モデルに追従する: バイアス監査のためのシステムベンチマーク拡張

Keeping Up with the Language Models: Systematic Benchmark Extension for Bias Auditing ( http://arxiv.org/abs/2305.12620v2 )

ライセンス: Link先を確認

Ioana Baldini, Chhavi Yadav, Manish Nagireddy, Payel Das, Kush R. Varshney,

(参考訳) 言語モデル (LM) のバイアス監査は, LM が普及するにつれて注目されている。このように、バイアス監査のためのいくつかのベンチマークが提案されている。同時に、LMの急速な進化は、これらのベンチマークをすぐに無関係にすることができる。バイアス監査は、LMの脆性によってさらに複雑である: おそらくバイアスのある結果が観察された場合、それはモデルバイアスかモデル脆性によるものか? モデル自体を登録して、困難なままのバイアス監査データセットの構築を支援し、異なるタイプのモデルエラーを区別するバイアス測定を導入することを提案する。まず,NLI(BBNLI)の既存のバイアスベンチマークを,LM生成語彙の変動,逆フィルタリング,人間による検証の組み合わせを用いて拡張する。 BBNLI-nextは平均して最先端のNLIモデルの精度を95.3%から57.5%に下げる。次に、BBNLI-nextを用いて、ロバスト性とバイアスの相互作用を示す。現在のバイアススコアの欠点を指摘し、バイアスとモデルの脆さを考慮に入れたバイアス対策を提案する。第三に、BBNLI-nextは非生成モデルを念頭に設計されているにもかかわらず、新しいデータセットは、最先端のオープンソース生成LMのバイアスを明らかにすることが可能であることを示す。注: この研究に含まれるすべてのデータセットは英語で書かれており、米国中心の社会的偏見に対処している。効率的なNLP研究の精神において、この研究を行うためのモデルトレーニングや微調整は行われなかった。警告: 攻撃的なテキスト例を含む。

Bias auditing of language models (LMs) has received considerable attention as LMs are becoming widespread. As such, several benchmarks for bias auditing have been proposed. At the same time, the rapid evolution of LMs can make these benchmarks irrelevant in no time. Bias auditing is further complicated by LM brittleness: when a presumably biased outcome is observed, is it due to model bias or model brittleness? We propose enlisting the models themselves to help construct bias auditing datasets that remain challenging, and introduce bias measures that distinguish between different types of model errors. First, we extend an existing bias benchmark for NLI (BBNLI) using a combination of LM-generated lexical variations, adversarial filtering, and human validation. We demonstrate that the newly created dataset BBNLI-next is more challenging than BBNLI: on average, BBNLI-next reduces the accuracy of state-of-the-art NLI models from 95.3%, as observed by BBNLI, to a strikingly low 57.5%. Second, we employ BBNLI-next to showcase the interplay between robustness and bias: we point out shortcomings in current bias scores and propose bias measures that take into account both bias and model brittleness. Third, despite the fact that BBNLI-next was designed with non-generative models in mind, we show that the new dataset is also able to uncover bias in state-of-the-art open-source generative LMs. Note: All datasets included in this work are in English and they address US-centered social biases. In the spirit of efficient NLP research, no model training or fine-tuning was performed to conduct this research. Warning: This paper contains offensive text examples.

公開日:2024-09-25
翻訳日:2024-11-09 15:13:22

# GUARD: 安全な強化学習ベンチマーク

GUARD: A Safe Reinforcement Learning Benchmark ( http://arxiv.org/abs/2305.13681v4 )

ライセンス: Link先を確認

Weiye Zhao, Yifan Sun, Feihan Li, Rui Chen, Ruixuan Liu, Tianhao Wei, Changliu Liu,

(参考訳) 試行錯誤の性質のため、そのようなエラーが許容できない自律運転、人間とロボットのインタラクション、ロボット操作など、安全クリティカルな現実世界のアプリケーションにRLアルゴリズムを適用することは、一般的に困難である。近年、安全なRL(すなわち制約付きRL)は、制約を満たすとともに、エージェントが環境を探索する文献に急速に現れている。アルゴリズムとタスクの多様性のため、既存の安全なRLアルゴリズムを比較するのは難しい。このギャップを埋めるために、一般化されたSAfe強化学習ベンチマークであるGUARDを紹介します。 GUARDは既存のベンチマークと比べていくつかの利点がある。まず、GUARDは様々なRLエージェント、タスク、安全制約仕様を備えた一般化されたベンチマークである。第2に、GUARDは自己完結した実装で最先端の安全なRLアルゴリズムを包括的にカバーしている。第3に、GUARDはタスクやアルゴリズムで高度にカスタマイズできる。本稿では,GUARDを用いた各種タスク設定における最先端安全RLアルゴリズムの比較を行い,今後の作業が構築できるベースラインを確立する。

Due to their trial-and-error nature, it is typically challenging to apply RL algorithms to safety-critical real-world applications, such as autonomous driving, human-robot interaction, and robot manipulation, where such errors are not tolerable. Recently, safe RL (i.e., constrained RL), in which agents explore the environment while satisfying constraints, has emerged rapidly in the literature. Due to the diversity of algorithms and tasks, it remains difficult to compare existing safe RL algorithms. To fill that gap, we introduce GUARD, a Generalized Unified SAfe Reinforcement Learning Development Benchmark. GUARD has several advantages compared to existing benchmarks. First, GUARD is a generalized benchmark with a wide variety of RL agents, tasks, and safety constraint specifications. Second, GUARD comprehensively covers state-of-the-art safe RL algorithms with self-contained implementations. Third, GUARD is highly customizable in tasks and algorithms. We present a comparison of state-of-the-art safe RL algorithms in various task settings using GUARD and establish baselines that future work can build on.

公開日:2024-09-24
翻訳日:2024-11-09 15:02:22

# フォノンによる巨大物体を持つ空間量子重ね合わせの極限

Limit on spatial quantum superpositions with massive objects due to phonons ( http://arxiv.org/abs/2305.15230v2 )

ライセンス: Link先を確認

Carsten Henkel, Ron Folman,

(参考訳) 巨大な物体を実空間の異なる位置の重ね合わせに持ち込むことは長年の目標であり、新しい状態における量子理論を確かめるだけでなく、重力との界面を探索することでもある。主な課題は通常、大きな物体の波動関数を統計的混合に分解する環境場や粒子による力や散乱によって生じると考えられている。環境からの隔離の改善によって除去できないデコヒーレンスチャネルを公表する。これは物体内の音波から派生したもので、任意の分裂過程の一部として励起され、部分的な「ヴェルチャー・ウェッグ」情報を運ぶ。これにより、大きな物体の将来の空間重ね合わせに厳密な制約が課される。

It has been a long-standing goal to bring massive objects into a superposition of different locations in real space, not only to confirm quantum theory in new regimes, but also to explore the interface with gravity. The main challenge is usually thought to arise from forces or scattering due to environmental fields and particles that decohere the large object's wave function into a statistical mixture. We unveil a decoherence channel which cannot be eliminated by improved isolation from the environment. It originates from sound waves within the object, which are excited as part of any splitting process and carry partial "Welcher Weg" information. This puts stringent constraints on future spatial superpositions of large objects.

公開日:2024-09-27
翻訳日:2024-11-09 15:02:22

# Taylorformer: 時系列を含むランダムプロセスの確率論的モデリング

Taylorformer: Probabilistic Modelling for Random Processes including Time Series ( http://arxiv.org/abs/2305.19141v2 )

ライセンス: Link先を確認

Omer Nivron, Raghul Parthipan, Damon J. Wischik,

(参考訳) 時系列などのランダムなプロセスに対してTaylorformerを提案する。その2つの重要な構成要素は以下のとおりである。 1) ニューラルネットワークに基づく確率モデルにおけるTaylor近似(力学系で使用される)を適応するLocalTaylorラッパー 2) ガウス過程の平均予測が文脈データの線形滑らか化にどのように影響するかに着想を得たMHA-Xアテンションブロック。 Taylorformerは、メタラーニング1D機能のような5/6の古典的なニューラル・プロセスのタスクで、ログライクな点では最先端のタスクを上回り、電気、油温、為替レートなどの予測タスクでは、少なくとも14倍のMSEを改善している。 Taylorformerは、一貫した確率過程を近似し、不確実性を考慮した予測を提供する。私たちのコードは補足材料で提供されます。

We propose the Taylorformer for random processes such as time series. Its two key components are: 1) the LocalTaylor wrapper which adapts Taylor approximations (used in dynamical systems) for use in neural network-based probabilistic models, and 2) the MHA-X attention block which makes predictions in a way inspired by how Gaussian Processes' mean predictions are linear smoothings of contextual data. Taylorformer outperforms the state-of-the-art in terms of log-likelihood on 5/6 classic Neural Process tasks such as meta-learning 1D functions, and has at least a 14% MSE improvement on forecasting tasks, including electricity, oil temperatures and exchange rates. Taylorformer approximates a consistent stochastic process and provides uncertainty-aware predictions. Our code is provided in the supplementary material.

公開日:2024-09-23
翻訳日:2024-11-09 15:02:22

# プライバシ保護による会計認証:ユニバーサルログインのためのLarchシステム

Accountable authentication with privacy protection: The Larch system for universal login ( http://arxiv.org/abs/2305.19241v8 )

ライセンス: Link先を確認

Emma Dauterman, Danny Lin, Henry Corrigan-Gibbs, David Mazières,

(参考訳) クレデンシャル妥協は検出が難しく、緩和が難しい。この問題に対処するために,強力なセキュリティとプライバシ特性を備えた説明可能な認証フレームワークであるlarchを提案する。 Larchはユーザのプライバシを保護し、larchログサーバがすべての認証を正しく記録することを保証する。具体的には、ユーザのデバイスを侵害した攻撃者は、ログに証拠を作成せずに認証することができず、ログは、ユーザが認証しているWebサービス(サードパーティ)を学習することはできない。迅速な採用を実現するため、larchはFIDO2、TOTP、パスワードベースのログインをサポートするサードパーティと後方互換性がある。さらに、larchは、ユーザがすでに期待しているセキュリティとプライバシを劣化させません。ログサーバは、ユーザに代わって認証することができません。 FIDO2、TOTP、パスワードベースのログインのためのlarchを実装している。 4コアのクライアントと8コアのログサーバが与えられた後、larchによる認証はFIDO2で150ms、TOTPで91ms、パスワードで74ms(TOTPで1.23s)。

Credential compromise is hard to detect and hard to mitigate. To address this problem, we present larch, an accountable authentication framework with strong security and privacy properties. Larch protects user privacy while ensuring that the larch log server correctly records every authentication. Specifically, an attacker who compromises a user's device cannot authenticate without creating evidence in the log, and the log cannot learn which web service (relying party) the user is authenticating to. To enable fast adoption, larch is backwards-compatible with relying parties that support FIDO2, TOTP, and password-based login. Furthermore, larch does not degrade the security and privacy a user already expects: the log server cannot authenticate on behalf of a user, and larch does not allow relying parties to link a user across accounts. We implement larch for FIDO2, TOTP, and password-based login. Given a client with four cores and a log server with eight cores, an authentication with larch takes 150ms for FIDO2, 91ms for TOTP, and 74ms for passwords (excluding preprocessing, which takes 1.23s for TOTP).
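For readers unfamiliar with TOTP, one of the login methods larch stays compatible with, the standard code computation (RFC 6238 with HMAC-SHA1) can be sketched as follows. This shows only the plain single-party TOTP computation, not larch's logging protocol; the function name `totp` and its parameter names are chosen here for illustration.

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Compute a time-based one-time password as described in RFC 6238."""
    counter = unix_time // step                      # number of elapsed time steps
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F                          # dynamic truncation (RFC 4226)
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)
```

Because the code depends only on a shared secret and the current time, any party holding the secret can authenticate, which is why the abstract stresses that the log server must not be able to authenticate on the user's behalf.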

公開日:2024-09-23
翻訳日:2024-11-09 15:02:22

# 絡み合うコンパスとしてのグリュナイゼンパラメータとヘルマン・ファインマンの定理の分解

Grüneisen parameter as an entanglement compass and the breakdown of the Hellmann-Feynman theorem ( http://arxiv.org/abs/2306.00566v2 )

ライセンス: Link先を確認

Lucas Squillante, Luciano S. Ricco, Aniekan Magnus Ukpong, Roberto E. Lagos-Monaco, Antonio C. Seridonio, Mariano de Souza,

(参考訳) Grüneisen比 $\Gamma$, すなわち、熱膨張と比熱の比の特異部分は、有限温度($T$)の臨界点と量子臨界点(QCP)の両方を探索するために広く用いられている。真の量子相転移(QPT)では、熱ゆらぎが欠如しており、熱力学的な$\Gamma$は使用できない。チューニングパラメータ $\lambda$ の関数として絡み合いを計算する$\Gamma$ の量子アナログを提案し、基底状態エネルギーが非直線的に$\lambda$ に依存するシステムに対してのみ QPT が実行されることを示す。さらに、任意のQCPにおける熱力学極限におけるヘルマン・ファインマンの定理の分解を実証する。本稿では,逆場をもつ量子1次元イジングモデルとケーンの量子コンピュータを用いたアプローチを紹介する。ダイナミクスの減速と、QCP/QPTに近い「質量の創出」についても論じる。

The Grüneisen ratio $\Gamma$, i.e., the singular part of the ratio of thermal expansion to the specific heat, has been broadly employed to explore both finite-$T$ and quantum critical points (QCPs). For a genuine quantum phase transition (QPT), thermal fluctuations are absent and thus the thermodynamic $\Gamma$ cannot be employed. We propose a quantum analogue to $\Gamma$ that computes entanglement as a function of a tuning parameter $\lambda$ and show that QPTs take place only for systems in which the ground-state energy depends on $\lambda$ non-linearly. Furthermore, we demonstrate the breakdown of the Hellmann-Feynman theorem in the thermodynamic limit at any QCP. We showcase our approach using the quantum 1D Ising model with transverse field and Kane's quantum computer. The slowing down of the dynamics and thus the "creation of mass" close to any QCP/QPT is also discussed.
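As a reminder of the two standard objects the abstract refers to, the textbook definitions can be written out as below. The notation ($\alpha_p$, $c_p$, $E_0$, $\psi_0$) and the normalization of $\Gamma$ are assumptions made here; the abstract does not give the form of the proposed quantum analogue, so it is not reproduced.

```latex
% Thermodynamic Grüneisen ratio (up to normalization conventions):
\Gamma \;\propto\; \frac{\alpha_p}{c_p},
\qquad
\alpha_p = \frac{1}{V}\left(\frac{\partial V}{\partial T}\right)_{p},
\qquad
c_p = T\left(\frac{\partial S}{\partial T}\right)_{p}.

% Hellmann-Feynman theorem for a normalized ground state \psi_0(\lambda)
% of H(\lambda) with energy E_0(\lambda):
\frac{dE_0}{d\lambda}
  = \left\langle \psi_0(\lambda) \,\middle|\, \frac{\partial H}{\partial \lambda} \,\middle|\, \psi_0(\lambda) \right\rangle .
```

For a Hamiltonian of the form $H(\lambda) = H_0 + \lambda H_1$, the theorem reduces to $dE_0/d\lambda = \langle H_1 \rangle$, so any non-linear dependence of $E_0$ on $\lambda$ shows up as a $\lambda$-dependent expectation value of $H_1$.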

公開日:2024-09-25
翻訳日:2024-11-09 15:02:22

# 弦理論における有限エンタングルメントエントロピー

Finite Entanglement Entropy in String Theory ( http://arxiv.org/abs/2306.00990v2 )

ライセンス: Link先を確認

Atish Dabholkar, Upamanyu Moitra,

(参考訳) 我々は、10次元のタイプII弦理論における1ループの量子エンタングルメントエントロピーを、任意の奇数の整数$N > 1$で知られている$\mathbb{R}^2/\mathbb{Z}_N$の弦オービフォールドに対する属1分割関数を解析的に$N$で連続させることにより解析する。オービフォールド分割関数に対するタキオン寄与は、物理的領域 $0 < N \leq 1$ において有限である式に適切にまとめ、解析的に連続し、エンタングルメントエントロピーに対する有限で計算可能な解が得られることを示す。情報パラドックス,量子重力,ホログラフィーにおけるエンタングルメントエントロピーの有限性の影響について論じる。

We analyze the one-loop quantum entanglement entropy in ten-dimensional Type-II string theory using the orbifold method by analytically continuing in $N$ the genus-one partition function for string orbifolds on $\mathbb{R}^2/\mathbb{Z}_N$ conical spaces known for all odd integers $N > 1$. We show that the tachyonic contributions to the orbifold partition function can be appropriately summed and analytically continued to an expression that is finite in the physical region $0 < N \leq 1$ resulting in a finite and calculable answer for the entanglement entropy. We discuss the implications of the finiteness of the entanglement entropy for the information paradox, quantum gravity, and holography.

公開日:2024-09-24
翻訳日:2024-11-09 15:02:22

# SelFLoc: 大規模クラウドによる位置認識のための選択的特徴融合

SelFLoc: Selective Feature Fusion for Large-scale Point Cloud-based Place Recognition ( http://arxiv.org/abs/2306.01205v3 )

ライセンス: Link先を確認

Qibo Qiu, Wenxiao Wang, Haochao Ying, Dingkun Liang, Haiming Gao, Xiaofei He,

(参考訳) ポイントクラウドベースの位置認識は、特にグローバルな位置センサがアクセスできない場合、モバイルロボットや自動運転車にとって不可欠である。物体や建物の表面にはLiDARの点が散在しており、異なる軸に沿って強い形状の先行している。特定の軸に沿ったメッセージパッシングを改善するために,本論文の主なコントリビューションのひとつとして,スタック型非対称畳み込みブロック(SACB)が設計されている。総合的な実験により、SACBが採用した非対称な畳み込みとその戦略が、ポイントクラウドの特徴のより効果的な表現に寄与できることが示されている。そこで,SFFB (Selective Feature Fusion Block) は,特定の鍵領域の局所的特徴を選択的に増強し,融合前の特徴を整列させる。 SACBとSFFBは、SelFLocと呼ばれるポイントクラウドベースの位置認識のための堅牢で正確なアーキテクチャを構築するために結合される。比較実験の結果,SelFLoc は,平均リコール@1。

Point cloud-based place recognition is crucial for mobile robots and autonomous vehicles, especially when the global positioning sensor is not accessible. LiDAR points are scattered on the surface of objects and buildings, which have strong shape priors along different axes. To enhance message passing along particular axes, the Stacked Asymmetric Convolution Block (SACB) is designed, which is one of the main contributions of this paper. Comprehensive experiments demonstrate that asymmetric convolution and the corresponding strategies employed by SACB contribute to a more effective representation of point cloud features. On this basis, the Selective Feature Fusion Block (SFFB), which is formed by stacking point- and channel-wise gating layers in a predefined sequence, is proposed to selectively boost salient local features in certain key regions, as well as to align the features before the fusion phase. SACBs and SFFBs are combined to construct a robust and accurate architecture for point cloud-based place recognition, which is termed SelFLoc. Comparative experimental results show that SelFLoc achieves state-of-the-art (SOTA) performance on the Oxford benchmark and three other in-house benchmarks, with an improvement of 1.6 percentage points in mean average recall@1.
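The recall@1 figure reported above is a retrieval metric. A minimal sketch of how such a number is commonly computed in place recognition is given below, assuming a query counts as recognized when at least one of its top-k retrieved candidates is a true match. The function name `recall_at_k` and the input layout are hypothetical, and the benchmark's exact protocol may differ.

```python
def recall_at_k(retrieved, positives, k=1):
    """Fraction of queries whose top-k retrieved candidates contain a true match.

    retrieved: dict mapping query id -> list of candidate ids, best match first
    positives: dict mapping query id -> set of ids that count as true matches
    """
    hits = 0
    evaluated = 0
    for query, ranked in retrieved.items():
        truth = positives.get(query, set())
        if not truth:
            continue  # queries without any true match are skipped (an assumption)
        evaluated += 1
        if any(candidate in truth for candidate in ranked[:k]):
            hits += 1
    return hits / evaluated if evaluated else 0.0
```

Averaging this value over several test sequences gives a mean recall@1; an improvement of 1.6 percentage points means, for example, moving from 0.900 to 0.916.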

公開日:2024-09-24
翻訳日:2024-11-09 15:02:22

# 漸近テンソルランクの離散性

Discreteness of asymptotic tensor ranks ( http://arxiv.org/abs/2306.01718v3 )

ライセンス: Link先を確認

Jop Briët, Matthias Christandl, Itai Leigh, Amir Shpilka, Jeroen Zuiddam,

(参考訳) テンソルパラメータは、しばしば「漸近的」テンソルパラメータと呼ばれ、代数的複雑性理論(高速行列乗算アルゴリズムの構築)、量子情報(絡み合いコストと蒸留可能な絡み合い)、加法的コンビネータ(キャップセット、サンフラワーフリーセットなど)を含むいくつかの領域において中心的な役割を果たす。例えば、漸近テンソルランク、漸近スライスランク、漸近サブランクである。最近の研究 (Costa-Dalai, Blatter-Draisma-Rupniewski, Christandl-Gesmundo-Zuiddam) では、そのようなテンソルパラメータの値における離散性(累積点を持たない)や「ギャップ」の概念が研究されている。我々は、次数3テンソルの漸近テンソルパラメータに対する一般的な離散性定理を証明し、これを、(1)任意の有限体(実際、任意の体における係数の有限集合)、漸近部分ランクおよび漸近スライスランクが累積点を持たないこと、(2)複素数上では、漸近スライスランクが累積点を持たないことを証明するために利用する。我々のアプローチの中心はテンソルの漸近部分ランクの2つの新しい一般下界であり、テンソルがどれだけ対角化できるかを測定する。最初の下界は、任意の簡潔な3次元テンソルの漸近部分ランクは、少なくとも最小次元の立方根であると述べている。 2番目の下界は、「十分狭く」(他の2つよりも1次元がかなり小さい)任意の簡潔な3つのテンソルは、最大漸近部分ランクを持つと述べている。我々の証明は、行列部分空間の最大階数に対する新しい下界に依存し、3つの異なる方向に3つのテンソルをスライスすることで得られる。任意の簡潔なテンソルに対して、そのような最大ランクの任意の2つの積は大きいものでなければならないことを証明し、その結果、常に大きな最大ランクを持つ2つの異なる方向が存在する。

Tensor parameters that are amortized or regularized over large tensor powers, often called "asymptotic" tensor parameters, play a central role in several areas including algebraic complexity theory (constructing fast matrix multiplication algorithms), quantum information (entanglement cost and distillable entanglement), and additive combinatorics (bounds on cap sets, sunflower-free sets, etc.). Examples are the asymptotic tensor rank, asymptotic slice rank and asymptotic subrank. Recent works (Costa-Dalai, Blatter-Draisma-Rupniewski, Christandl-Gesmundo-Zuiddam) have investigated notions of discreteness (no accumulation points) or "gaps" in the values of such tensor parameters. We prove a general discreteness theorem for asymptotic tensor parameters of order-three tensors and use this to prove that (1) over any finite field (and in fact any finite set of coefficients in any field), the asymptotic subrank and the asymptotic slice rank have no accumulation points, and (2) over the complex numbers, the asymptotic slice rank has no accumulation points. Central to our approach are two new general lower bounds on the asymptotic subrank of tensors, which measures how much a tensor can be diagonalized. The first lower bound says that the asymptotic subrank of any concise three-tensor is at least the cube-root of the smallest dimension. The second lower bound says that any concise three-tensor that is "narrow enough" (has one dimension much smaller than the other two) has maximal asymptotic subrank. Our proofs rely on new lower bounds on the maximum rank in matrix subspaces that are obtained by slicing a three-tensor in the three different directions. We prove that for any concise tensor, the product of any two such maximum ranks must be large, and as a consequence there are always two distinct directions with large max-rank.
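The quantities named in the abstract can be written compactly as follows. The symbols $R$, $Q$, $\tilde{R}$, $\tilde{Q}$ and the dimensions $n_1, n_2, n_3$ are notation assumed here for illustration; the bounds are restated from the abstract and are not derived.

```latex
% Asymptotic rank and asymptotic subrank of a tensor T
% (R = tensor rank, Q = subrank, T^{\otimes n} = n-th tensor power):
\tilde{R}(T) = \lim_{n \to \infty} R\!\left(T^{\otimes n}\right)^{1/n},
\qquad
\tilde{Q}(T) = \lim_{n \to \infty} Q\!\left(T^{\otimes n}\right)^{1/n}.

% First lower bound, for a concise tensor
% T \in \mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}:
\tilde{Q}(T) \;\ge\; \min(n_1, n_2, n_3)^{1/3}.

% Second lower bound, for a concise tensor that is "narrow enough":
\tilde{Q}(T) \;=\; \min(n_1, n_2, n_3).
```

Discreteness then means that the set of values taken by such a parameter, over all tensors in the stated class, has no accumulation points.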

公開日:2024-09-24
翻訳日:2024-11-09 15:02:22

# 線形文脈による探索のインセンティブと組合せ行動

Incentivizing Exploration with Linear Contexts and Combinatorial Actions ( http://arxiv.org/abs/2306.01990v3 )

ライセンス: Link先を確認

Mark Sellke,

(参考訳) 我々は、腕の選択を推奨とみなし、ベイズ的インセンティブと互換性を持たなければならない、インセンティブ付きバンディット探索の研究を前進させる。最近の研究は、十分な初期サンプルを収集した後、人気のあるトンプソンサンプリングアルゴリズムがインセンティブ互換になる、という一定の独立性の仮定の下で示されている。線形包帯に対してこの結果の類似性を与え、そこでは前者の独立性を自然凸条件に置き換える。これにより、高次元の行動空間における効率的かつ後悔に満ちたインセンティブ付き探索の可能性が開ける。半帯域モデルでは、初期データ収集のトンプソン前サンプリングフェーズにおけるサンプルの複雑さも改善する。

We advance the study of incentivized bandit exploration, in which arm choices are viewed as recommendations and are required to be Bayesian incentive compatible. Recent work has shown under certain independence assumptions that after collecting enough initial samples, the popular Thompson sampling algorithm becomes incentive compatible. We give an analog of this result for linear bandits, where the independence of the prior is replaced by a natural convexity condition. This opens up the possibility of efficient and regret-optimal incentivized exploration in high-dimensional action spaces. In the semibandit model, we also improve the sample complexity for the pre-Thompson sampling phase of initial data collection.

公開日:2024-09-24
翻訳日:2024-11-09 15:02:22
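
The two-phase structure described in the abstract above (collect initial samples first, then hand control to Thompson sampling) can be sketched in a few lines. This is only an illustration under simplifying assumptions: it uses independent Bernoulli arms with Beta(1, 1) priors rather than the paper's linear bandit setting, the function name `thompson_with_warmup` and its parameters are hypothetical, and the code does not check incentive compatibility in any way.

```python
import random


def thompson_with_warmup(true_means, warmup_per_arm, horizon, seed=0):
    """Pull every arm `warmup_per_arm` times, then run Beta-Bernoulli Thompson sampling.

    Assumption: independent Bernoulli arms with Beta(1, 1) priors. This is a toy
    stand-in, not the linear bandit model studied in the paper.
    """
    rng = random.Random(seed)
    k = len(true_means)
    successes = [0] * k
    failures = [0] * k
    pulls = []

    def pull(arm):
        reward = 1 if rng.random() < true_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls.append(arm)

    # Phase 1: initial data collection, a fixed number of pulls per arm.
    for arm in range(k):
        for _ in range(warmup_per_arm):
            pull(arm)

    # Phase 2: sample a mean for each arm from its posterior and recommend the argmax.
    for _ in range(horizon):
        samples = [rng.betavariate(1 + successes[a], 1 + failures[a]) for a in range(k)]
        pull(max(range(k), key=lambda a: samples[a]))

    return pulls, successes, failures
```

Calling `thompson_with_warmup([0.2, 0.8], warmup_per_arm=5, horizon=200)` returns the full pull sequence, so the warm-up prefix and the later Thompson recommendations can be inspected separately.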

# 合成能動推論エージェントの実現その2: 変分メッセージ更新

Realising Synthetic Active Inference Agents, Part II: Variational Message Updates ( http://arxiv.org/abs/2306.02733v3 )

ライセンス: Link先を確認

Thijs van de Laar, Magnus Koudahl, Bert de Vries,

(参考訳) 自由エネルギー原則(FEP)は、(生物学的)エージェントを、環境の生成モデルに関する変動自由エネルギー(FE)を最小化するものとして記述している。アクティブ推論(英: Active Inference、AIF)は、エージェントが期待されるFE目標を最小化することによって環境を探索し、活用する方法を記述するFEPのまとめである。 2つの関連論文において、自由形式のForney-style Factor Graphs (FFGs) 上のメッセージパッシングによるAIFのスケーラブルでエピステマティックなアプローチについて述べる。共用紙(第1部)は、AFFのFE目標を視覚的に(一般化)する制約付きFFG(CFFG)表記法を導入する。現在の論文(パートII)は、変分法によりCFFG上のFE目的を最小化(一般化)するメッセージパッシングアルゴリズムを導出する。シミュレーションされたBetheと一般化されたFEエージェントの比較は、合成AIFへのメッセージパッシングアプローチがT迷路ナビゲーションタスクにおいてどのようにててんかん行動を引き起こすかを示している。 T迷路シミュレーションの拡張 1)目標統計の学習、及び 2)マルチエージェントバーゲティング設定は、このアプローチがノードの再利用と代替設定の更新をいかに促すかを示している。合成AIFエージェントの完全なメッセージパッシングアカウントにより、モデル間でのメッセージ更新を導出し再利用し、合成AIFの産業的応用に近づくことができる。

The Free Energy Principle (FEP) describes (biological) agents as minimising a variational Free Energy (FE) with respect to a generative model of their environment. Active Inference (AIF) is a corollary of the FEP that describes how agents explore and exploit their environment by minimising an expected FE objective. In two related papers, we describe a scalable, epistemic approach to synthetic AIF, by message passing on free-form Forney-style Factor Graphs (FFGs). A companion paper (part I) introduces a Constrained FFG (CFFG) notation that visually represents (generalised) FE objectives for AIF. The current paper (part II) derives message passing algorithms that minimise (generalised) FE objectives on a CFFG by variational calculus. A comparison between simulated Bethe and generalised FE agents illustrates how the message passing approach to synthetic AIF induces epistemic behaviour on a T-maze navigation task. Extensions of the T-maze simulation to 1) learning goal statistics, and 2) a multi-agent bargaining setting illustrate how this approach encourages reuse of nodes and updates in alternative settings. With a full message passing account of synthetic AIF agents, it becomes possible to derive and reuse message updates across models and move closer to industrial applications of synthetic AIF.

公開日:2024-09-26
翻訳日:2024-11-09 15:02:22

# 最適木アンサンブルの計算について

On Computing Optimal Tree Ensembles ( http://arxiv.org/abs/2306.04423v2 )

ライセンス: Link先を確認

Christian Komusiewicz, Pascal Kunz, Frank Sommer, Manuel Sorge,

(参考訳) ランダムフォレストや、より一般的には(決定)木アンサンブルは、分類と回帰の方法として広く使われている。最近のアルゴリズムの進歩により、サイズや深さなどの様々な尺度に対して最適な決定木を計算できるようになった。我々は、木アンサンブルについてそのような研究を知らず、この領域に貢献することを目指している。主に、2つの新しいアルゴリズムと対応する下界を提供する。まず、決定木に対する計算可能性の結果を引き継ぎ、大幅に改善することができる: トレーニングデータセットとサイズ上限$S \in \mathbb{R}$が与えられた場合、サイズが高々$S$で、データを正しく分類する木アンサンブルを計算するアルゴリズムを得る。このアルゴリズムは$(4\delta D S)^S \cdot poly$-timeで実行され、$D$は最大のドメインサイズ、$\delta$は2つの例の間で異なる特徴数の最大値、$n$は入力例の数、$poly$は入力サイズの多項式である。決定木、すなわち、サイズ1のアンサンブルに対して、$(\delta D s)^s \cdot poly$ のランニング時間を得る。ここで$s$は木のサイズである。これらのアルゴリズムを実現するために,実践的な実装に期待できるwitness-tree技法を導入する。第2に、決定木の計算にうまく適用されてきた動的計画法は、木アンサンブルにも有効である可能性を示し、$\ell^n \cdot poly$-timeアルゴリズムを提供する。ここで$\ell$は木の数である。最後に、決定木と木アンサンブルのトレーニングデータセットの分類に必要なカット数を比較し、木の数が増えるにつれてアンサンブルが指数関数的に少ないカットで済む場合があることを示す。

Random forests and, more generally, (decision-)tree ensembles are widely used methods for classification and regression. Recent algorithmic advances make it possible to compute decision trees that are optimal for various measures such as their size or depth. We are not aware of such research for tree ensembles and aim to contribute to this area. Mainly, we provide two novel algorithms and corresponding lower bounds. First, we are able to carry over and substantially improve on tractability results for decision trees: We obtain an algorithm that, given a training-data set and a size bound $S \in \mathbb{R}$, computes a tree ensemble of size at most $S$ that classifies the data correctly. The algorithm runs in $(4\delta D S)^S \cdot poly$-time, where $D$ is the largest domain size, $\delta$ is the largest number of features in which two examples differ, $n$ is the number of input examples, and $poly$ is a polynomial of the input size. For decision trees, that is, ensembles of size 1, we obtain a running time of $(\delta D s)^s \cdot poly$, where $s$ is the size of the tree. To obtain these algorithms, we introduce the witness-tree technique, which seems promising for practical implementations. Secondly, we show that dynamic programming, which has been applied successfully to computing decision trees, may also be viable for tree ensembles, providing an $\ell^n \cdot poly$-time algorithm, where $\ell$ is the number of trees. Finally, we compare the number of cuts necessary to classify training data sets for decision trees and tree ensembles, showing that ensembles may need exponentially fewer cuts for increasing number of trees.

公開日:2024-09-24
翻訳日:2024-11-09 15:02:22
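
To make the notions of "size", "cuts", and "classifies the data correctly" from the abstract above concrete, here is a small sketch of a tree ensemble with majority voting. It assumes binary labels, threshold cuts, and that size counts inner nodes (cuts) summed over all trees; the class and function names are hypothetical and this is not the paper's witness-tree algorithm, only a checker for a given ensemble.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    label: int  # 0 or 1


@dataclass
class Cut:
    feature: int
    threshold: float
    left: "Tree"   # followed when example[feature] <= threshold
    right: "Tree"  # followed otherwise


Tree = Union[Leaf, Cut]


def predict(tree, example):
    """Walk from the root to a leaf and return its label."""
    while isinstance(tree, Cut):
        tree = tree.left if example[tree.feature] <= tree.threshold else tree.right
    return tree.label


def size(tree):
    """Number of inner nodes (cuts) in a single tree."""
    return 0 if isinstance(tree, Leaf) else 1 + size(tree.left) + size(tree.right)


def ensemble_predict(trees, example):
    """Strict majority vote over binary labels."""
    votes = sum(predict(t, example) for t in trees)
    return 1 if 2 * votes > len(trees) else 0


def classifies_correctly(trees, data):
    """True if the ensemble reproduces every (example, label) pair."""
    return all(ensemble_predict(trees, x) == y for x, y in data)


# Toy data: label is the majority of three binary features.
data = [((a, b, c), 1 if a + b + c >= 2 else 0)
        for a in (0, 1) for b in (0, 1) for c in (0, 1)]
# Three one-cut trees, one per feature: total ensemble size 3.
stumps = [Cut(i, 0.5, Leaf(0), Leaf(1)) for i in range(3)]
```

On this toy data the three stumps together classify every example with three cuts in total, whereas a single tree with three cuts cannot, which is a small-scale instance of the cut-count gap the abstract mentions.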

# Etsy Searchにおける統一埋め込みに基づくパーソナライズされた検索

Unified Embedding Based Personalized Retrieval in Etsy Search ( http://arxiv.org/abs/2306.04833v2 )

ライセンス: Link先を確認

Rishikesh Jha, Siddharth Subramaniyam, Ethan Benjamin, Thrivikrama Taula,

(参考訳) 埋め込みベースのニューラル検索は、末尾クエリの製品検索でしばしば発生するセマンティックギャップ問題に対処するための一般的なアプローチである。対照的に、一般的なクエリにはコンテキストが欠如しており、ユーザの過去のインタラクションから追加のコンテキストが役に立つような、幅広い意図がある。本稿では、セマンティックギャップ問題と、パーソナライズされたセマンティック検索のためのエンド・ツー・エンド・トレーニングモデルの両方に対処する新しいアプローチを共有する。グラフ, トランスフォーマー, 項ベースの埋め込みを終端から終端まで組み込んだ統合埋め込みモデルを学習し, 性能と効率の最適なトレードオフのための設計選択を共有することを提案する。我々は、機能工学、ハードネガティブサンプリング戦略、トランスフォーマーモデルの適用に関する知見を共有し、新しい事前学習戦略や、検索関連性を改善し、そのようなモデルを産業規模で展開するための他の手法を含む。我々のパーソナライズされた検索モデルは、検索購入率の5.58%、サイト全体のコンバージョン率の2.63%、複数のA/Bテストにまたがるライブトラフィックにおいて、検索体験を著しく改善する。

Embedding-based neural retrieval is a prevalent approach to address the semantic gap problem which often arises in product search on tail queries. In contrast, popular queries typically lack context and have a broad intent where additional context from users' historical interactions can be helpful. In this paper, we share our novel approach to address both: the semantic gap problem followed by an end-to-end trained model for personalized semantic retrieval. We propose learning a unified embedding model incorporating graph, transformer and term-based embeddings end to end and share our design choices for optimal tradeoff between performance and efficiency. We share our learnings in feature engineering, hard negative sampling strategy, and application of transformer model, including a novel pre-training strategy and other tricks for improving search relevance and deploying such a model at industry scale. Our personalized retrieval model significantly improves the overall search experience, as measured by a 5.58% increase in search purchase rate and a 2.63% increase in site-wide conversion rate, aggregated across multiple A/B tests on live traffic.

公開日:2024-09-25
翻訳日:2024-11-09 15:02:22

# Mnemonic Codeによるワンショット機械の学習

One-Shot Machine Unlearning with Mnemonic Code ( http://arxiv.org/abs/2306.05670v2 )

ライセンス: Link先を確認

Tomoya Yamashita, Masanori Yamada, Takashi Shibata,

(参考訳) 人工知能(AI)アプリケーションに固有の倫理的およびプライバシー上の問題は、ディープラーニングの急速な普及に対する懸念が高まっている。機械学習(MU)は、トレーニングされたAIモデルを望ましくないトレーニングデータを忘れさせることによって、これらの問題に対処する研究領域である。残念なことに、既存のMUメソッドの多くは、忘れるのにかなりの時間と計算コストを必要とする。したがって、これらの手法を実用的なデータセットや高度なアーキテクチャ、例えば ImageNet や Transformer に適用することは、しばしば困難である。この問題に対処するために,軽量かつ効率的なMU法を提案する。本手法は, 忘れる対象に敏感なモデルパラメータを同定し, モデルパラメータに摂動を追加する。本稿では,FIM(Fisher Information Matrix)を計算し,その感度パラメータを同定する。このアプローチでは、忘れるのに時間を要する追加のトレーニングは必要ありません。さらに,Mnemonic codeと呼ばれるクラス固有のランダム信号を導入し,FIM計算のコストを削減する。本手法では, ムネモニック符号を用いてモデルを訓練し, ムネモニック符号を少数使用してFIMを計算し, 効率的に摂動し, 忘れる。包括的実験により,本手法は既存のMU法よりも高速で,忘れやすいことが示された。さらに,本手法は,より実用的なデータセットや高度なアーキテクチャに拡張可能であることを示す。

Ethical and privacy issues inherent in artificial intelligence (AI) applications have been a growing concern with the rapid spread of deep learning. Machine unlearning (MU) is the research area that addresses these issues by making a trained AI model forget about undesirable training data. Unfortunately, most existing MU methods incur significant time and computational costs for forgetting. Therefore, it is often difficult to apply these methods to practical datasets and sophisticated architectures, e.g., ImageNet and Transformer. To tackle this problem, we propose a lightweight and effective MU method. Our method identifies the model parameters sensitive to the forgetting targets and adds perturbation to such model parameters. We identify the sensitive parameters by calculating the Fisher Information Matrix (FIM). This approach does not require time-consuming additional training for forgetting. In addition, we introduce class-specific random signals called mnemonic code to reduce the cost of FIM calculation, which generally requires the entire training data and incurs significant computational costs. In our method, we train the model with mnemonic code; when forgetting, we use a small number of mnemonic codes to calculate the FIM and get the effective perturbation for forgetting. Comprehensive experiments demonstrate that our method is faster and better at forgetting than existing MU methods. Furthermore, we show that our method can scale to more practical datasets and sophisticated architectures.

公開日:2024-09-25
翻訳日:2024-11-09 15:02:22

# CCE:信頼度制御によるロボットナビゲーションのための効率的なスパースリワード政策学習

CCE: Sample Efficient Sparse Reward Policy Learning for Robotic Navigation via Confidence-Controlled Exploration ( http://arxiv.org/abs/2306.06192v8 )

ライセンス: Link先を確認

Bhrij Patel, Kasun Weerakoon, Wesley A. Suttle, Alec Koppel, Brian M. Sadler, Tianyi Zhou, Amrit Singh Bedi, Dinesh Manocha,

(参考訳) 本稿では,ロボットナビゲーションなどのスパース報酬設定のための強化学習(RL)アルゴリズムのトレーニングサンプル効率を高めるための新しい探索手法である信頼性制御探索(CCE)を紹介する。スパース報酬はRLで一般的であり、設計と実装に便利であるが、探索の課題のために対処するのが通常困難である。既存の手法では、探索課題に対処するための正規化ベースの手法が展開されている。しかし、正規化は報酬関数自体を変更するため、探索と搾取のバランスを特徴付けることは困難である。既存の文献における正規化に基づくアプローチとは対照的に、我々のアプローチであるCCEは、勾配推定と政策エントロピーの間の新しい関係に基づいている。 CCEは、探索を制御するために訓練中に使用される勾配更新のサンプル数を動的に調整する。興味深いことに、CCEは既存のオン・ポリティクスとオフ・ポリティクスのRL手法の両方に適用でき、この手法を3つの一般的なRL手法(REINFORCE, Proximal Policy Optimization (PPO),Soft Actor-Critic (SAC))に対して実証的に有効性を示す。我々は,サンプル予算を制約する場合に,一定の軌道長とエントロピー正規化を用いる従来の手法よりもCCEの方が優れる実世界のシミュレーション実験を通して実証する。固定されたサンプル予算では、CCEは航法成功率18\%、航法パス長20-38\%、高架コスト9.32\%を達成している。さらに,CCEをClearpath Huskyロボットに統合し,複雑な屋外環境に適用可能であることを示す。

We introduce Confidence-Controlled Exploration (CCE), a novel exploration scheme designed to enhance the training sample efficiency of reinforcement learning (RL) algorithms for sparse reward settings such as robot navigation. Sparse rewards are common in RL and convenient to design and implement, but typically hard to deal with due to the challenges of exploration. Existing methods rely on regularization to deal with the exploration challenges. However, it is hard to characterize the balance between exploration and exploitation because regularization modifies the reward function itself, hence changing the objective we are optimizing for. In contrast to regularization-based approaches in the existing literature, our approach, CCE, is based on a novel relationship we provide between gradient estimation and policy entropy. CCE dynamically adjusts the number of samples of the gradient update used during training to control exploration. Interestingly, CCE can be applied to both existing on-policy and off-policy RL methods, which we demonstrate by empirically validating its efficacy on three popular RL methods: REINFORCE, Proximal Policy Optimization (PPO), and Soft Actor-Critic (SAC) for goal-reaching robotic navigation tasks. We demonstrate through simulated and real-world experiments that CCE outperforms conventional methods that employ constant trajectory lengths and entropy regularization when constraining the sample budget. For a fixed sample budget, CCE achieves an 18% increase in navigation success rate, a 20-38% reduction in navigation path length, and a 9.32% decrease in elevation costs. Furthermore, we showcase the versatility of CCE by integrating it with the Clearpath Husky robot, illustrating its applicability in complex outdoor environments.

公開日:2024-09-24
翻訳日:2024-11-09 15:02:22

# 雑音を考慮した自己教師付き学習と効率的なエンコーダによる時系列符号化の改善

Improving Time Series Encoding with Noise-Aware Self-Supervised Learning and an Efficient Encoder ( http://arxiv.org/abs/2306.06579v2 )

ライセンス: Link先を確認

Duy A. Nguyen, Trang H. Tran, Huy Hieu Pham, Phi Le Nguyen, Lam M. Nguyen,

(参考訳) 本研究では,自己教師付き手法を用いた時系列表現学習問題について検討する。コントラスト学習はこの分野でよく知られており、シリーズから情報を抽出し、タスクに適した表現を生成するための強力な方法である。時系列の特徴を捉える能力にもかかわらず、これらの手法は、しばしば重要な要因である、この種のデータに固有のノイズを見落としている。さらに、効率的な軽量エンコーダアーキテクチャの開発には注目すべき注意が払われていない。本研究は,自然時系列における雑音波信号の存在を考慮し,一貫した表現学習を促進する革新的な学習戦略を提案することによって,これらのギャップに対処する。さらに,インセプションブロック内に拡張畳み込みを組み込んだエンコーダアーキテクチャを提案する。実験結果から, 予測, 分類, 異常検出など, 様々なタスクにおいて, 最先端のアプローチを一貫して上回る結果が得られた。特に,本手法はUCRデータセットの分類の3分の2以上で上位にランクされ,第2のアプローチと比較してパラメータの40%しか利用されていない。 CoInceptionフレームワークのソースコードは https://github.com/anhduy0911/CoInception からアクセスできます。

In this work, we investigate the time series representation learning problem using self-supervised techniques. Contrastive learning is well-known in this area as it is a powerful method for extracting information from the series and generating task-appropriate representations. Despite their proficiency in capturing time series characteristics, these techniques often overlook a critical factor: the inherent noise in this type of data, a consideration usually emphasized in general time series analysis. Moreover, there is a notable absence of attention to developing efficient yet lightweight encoder architectures, with an undue focus on delivering contrastive losses. Our work addresses these gaps by proposing an innovative training strategy that promotes consistent representation learning, accounting for the presence of noise-prone signals in natural time series. Furthermore, we propose an encoder architecture that incorporates dilated convolution within the Inception block, resulting in a scalable and robust network with a wide receptive field. Experimental findings underscore the effectiveness of our method, consistently outperforming state-of-the-art approaches across various tasks, including forecasting, classification, and abnormality detection. Notably, our method attains the top rank in over two-thirds of the UCR classification datasets, utilizing only 40% of the parameters compared to the second-best approach. Our source code for CoInception framework is accessible at https://github.com/anhduy0911/CoInception.

公開日:2024-10-05
翻訳日:2024-11-09 15:02:22

# 近似制約最適化のための自己教師付きEquality Embedded Deep Lagrange Dual

Self-supervised Equality Embedded Deep Lagrange Dual for Approximate Constrained Optimization ( http://arxiv.org/abs/2306.06674v5 )

ライセンス: Link先を確認

Minsoo Kim, Hongseok Kim,

(参考訳) 従来の解法はしばしば、特に大規模かつ時間クリティカルな問題において、制約付き最適化のために計算コストがかかる。これにより、ニューラルネットワーク(NN)を高速な最適解近似器として使用することへの関心が高まっているが、NNに制約を組み込むことは難しい。本稿では,ラベルを使わずに最適な解を求めるフレームワークであるDeepLDE(DeepLDE)を提案する。実現可能なソリューションを確保するため、NNに等価性制約を組み込み、未等式制約を課すために原始双対法を用いてNNを訓練する。さらに,DeepLDEの収束性を証明し,本手法だけでは等式埋め込みの助けなしには等式制約を保証できないことを示す。コンベックス,非凸,AC最適電力流(AC-OPF)問題に関するシミュレーション結果から,提案したDeepLDEはNNベースの全アプローチの中で最小の最適性ギャップを達成でき,かつ常に実現可能な解を確保できることを示す。さらに,提案手法の計算時間はDC3の約5～250倍であり,制約付き凸の解法,非凸最適化,AC-OPFの解法が提案されている。

Conventional solvers are often computationally expensive for constrained optimization, particularly in large-scale and time-critical problems. While this leads to a growing interest in using neural networks (NNs) as fast optimal solution approximators, incorporating the constraints with NNs is challenging. In this regard, we propose deep Lagrange dual with equality embedding (DeepLDE), a framework that learns to find an optimal solution without using labels. To ensure feasible solutions, we embed equality constraints into the NNs and train the NNs using the primal-dual method to impose inequality constraints. Furthermore, we prove the convergence of DeepLDE and show that the primal-dual learning method alone cannot ensure equality constraints without the help of equality embedding. Simulation results on convex, non-convex, and AC optimal power flow (AC-OPF) problems show that the proposed DeepLDE achieves the smallest optimality gap among all the NN-based approaches while always ensuring feasible solutions. Furthermore, the computation time of the proposed method is about 5 to 250 times faster than DC3 and the conventional solvers in solving constrained convex, non-convex optimization, and/or AC-OPF.

公開日:2024-09-23
翻訳日:2024-11-09 15:02:22
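
The two ingredients named in the DeepLDE abstract above, embedding the equality constraint so that it holds by construction and handling the inequality constraint with a primal-dual update, can be illustrated on a hand-sized problem. The sketch below is an assumption-laden toy: it solves min x² + y² subject to x + y = 1 and x ≤ 0.3, a single scalar `x` stands in for the neural network output, and the function name `solve_toy` and the step sizes are hypothetical choices, not values from the paper.

```python
def solve_toy(steps=2000, lr_primal=0.1, lr_dual=0.1):
    """Primal-dual gradient iteration for
        min x^2 + y^2   s.t.  x + y = 1,  x <= 0.3.

    Equality embedding: y is never a free variable; it is always reconstructed
    as y = 1 - x, so x + y = 1 holds at every iterate.
    The inequality x <= 0.3 is enforced through a multiplier `lam` >= 0.
    """
    x, lam = 0.0, 0.0
    for _ in range(steps):
        # Gradient in x of the Lagrangian x^2 + (1 - x)^2 + lam * (x - 0.3).
        grad_x = 2.0 * x - 2.0 * (1.0 - x) + lam
        x -= lr_primal * grad_x                      # primal descent step
        lam = max(0.0, lam + lr_dual * (x - 0.3))    # projected dual ascent step
    return x, 1.0 - x, lam
```

For this toy problem the KKT conditions give x = 0.3, y = 0.7 and multiplier 0.8, which is the point the iteration approaches. The equality constraint is satisfied exactly at every step, while the inequality is only satisfied in the limit, which mirrors the abstract's remark that the primal-dual method alone does not guarantee equality constraints.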

# 高次元過度線形回帰における最小ノルムリスクのバッチ安定化

Batches Stabilize the Minimum Norm Risk in High Dimensional Overparameterized Linear Regression ( http://arxiv.org/abs/2306.08432v3 )

ライセンス: Link先を確認

Shahar Stein Ioushua, Inbar Hasidim, Ofer Shayevitz, Meir Feder,

(参考訳) データをバッチに分割する学習アルゴリズムは、多くの機械学習アプリケーションで一般的であり、典型的には計算効率と性能のトレードオフを提供する。本稿では,等方的ガウス特徴を持つ最小ノルム過パラメータ線形回帰モデルのレンズによるバッチ分割の利点について検討する。最小ノルム推定器の自然な小バッチ版を提案し、その二次リスクを導出する。次に、最適なバッチサイズを特徴付け、ノイズレベルと過度パラメータ比に逆比例することを示す。最小ノルムとは対照的に,我々の推定器は過パラメトリゼーション比で単調に増加する安定なリスク挙動を認め,補間点での爆発と二重発振現象の両方を除去する。さらに、Weiner係数に等しい係数によるバッチ最小ノルム推定器の縮小がさらに安定化し、全ての設定において2次リスクを低くすることを示した。興味深いことに、バッチパーティションによって提供される暗黙の正規化は、バッチ間の機能の重複によって部分的に説明される。我々の境界は、新しい手法の組み合わせ、特にランダム部分空間上の雑音射影のワッサーシュタイン計量の正規近似によって導かれる。

Learning algorithms that divide the data into batches are prevalent in many machine-learning applications, typically offering useful trade-offs between computational efficiency and performance. In this paper, we examine the benefits of batch-partitioning through the lens of a minimum-norm overparametrized linear regression model with isotropic Gaussian features. We suggest a natural small-batch version of the minimum-norm estimator and derive bounds on its quadratic risk. We then characterize the optimal batch size and show it is inversely proportional to the noise level, as well as to the overparametrization ratio. In contrast to minimum-norm, our estimator admits a stable risk behavior that is monotonically increasing in the overparametrization ratio, eliminating both the blowup at the interpolation point and the double-descent phenomenon. We further show that shrinking the batch minimum-norm estimator by a factor equal to the Wiener coefficient further stabilizes it and results in lower quadratic risk in all settings. Interestingly, we observe that the implicit regularization offered by the batch partition is partially explained by feature overlap between the batches. Our bound is derived via a novel combination of techniques, in particular normal approximation in the Wasserstein metric of noisy projections over random subspaces.

公開日:2024-09-21
翻訳日:2024-11-09 15:02:22
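
A batch version of the minimum-norm estimator, as discussed in the abstract above, might look like the following sketch. It assumes that the estimator is the plain average of the per-batch minimum-norm interpolators, which the abstract does not spell out, and it omits the Wiener-coefficient shrinkage. The helper names (`solve`, `min_norm`, `batch_min_norm`) are hypothetical, and the linear algebra is written in plain Python to keep the example self-contained.

```python
def solve(a, b):
    """Solve the square system a z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def min_norm(xs, ys):
    """Minimum-norm interpolator beta = X^T (X X^T)^{-1} y.

    `xs` holds the rows of X; requires fewer rows than columns and full row rank.
    """
    gram = [[sum(u * v for u, v in zip(xi, xj)) for xj in xs] for xi in xs]
    alpha = solve(gram, ys)
    dim = len(xs[0])
    return [sum(alpha[i] * xs[i][k] for i in range(len(xs))) for k in range(dim)]


def batch_min_norm(xs, ys, batch_size):
    """Average of the minimum-norm interpolators computed on consecutive batches.

    Assumption: plain averaging; shrinkage is not applied here.
    """
    dim = len(xs[0])
    estimates = [min_norm(xs[i:i + batch_size], ys[i:i + batch_size])
                 for i in range(0, len(xs), batch_size)]
    return [sum(e[k] for e in estimates) / len(estimates) for k in range(dim)]
```

With `batch_size` equal to the number of samples this reduces to the ordinary minimum-norm estimator, and with `batch_size=1` each batch contributes `x * y / ||x||^2`, so the batch size interpolates between the two extremes.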

# 平板最小値探索のための雑音安定性最適化:ヘッセン系正規化手法

Noise Stability Optimization for Finding Flat Minima: A Hessian-based Regularization Approach ( http://arxiv.org/abs/2306.08553v4 )

ライセンス: Link先を確認

Hongyang R. Zhang, Dongyue Li, Haotian Ju,

(参考訳) 過度にパラメータ化されたニューラルネットワークのトレーニングは、最近の文献で多くの研究を受けている。重要な考慮事項は、その非凸性や非線形幾何学のため、過度にパラメータ化されたネットワークの正規化である。本稿では、損失のヘシアンを正規化できるノイズ注入アルゴリズムについて検討し、平面的な損失面を持つ領域を導出する。具体的には、ニューラルネットワークの重み行列に等方性ガウスノイズを注入することにより、ヘッセンの痕跡のほぼ偏りのない推定値を得ることができる。しかし、バックプロパゲーション前に重み行列にノイズを加えることでノイズ注入を鼻で行うと、経験的改善は限られる。この制限に対処するために、ランダムノイズの正方向と負方向の両方に沿って重み行列に雑音を注入するヘッセンペナルティの2点推定を設計する。特に、この2点推定は、ヘッセン上の一階テイラーの展開項の分散を排除している。我々は、データから測定できるヘッセン(および重み空間の半径)のトレースに依存するPAC-ベイズ一般化の有界性を示す。我々は,我々のアプローチを検証するための詳細な実験を行い,ヘッセン語を効果的に正則化し,一般化を向上させることができることを示す。まず,6つの画像分類データセット上での微調整ResNetの精度を最大2.4%向上させることができる。さらに、ヘッセンの痕跡は15.8%減少し、最大の固有値は我々のアプローチにより9.7%減少する。また、ヘッセンの正則化と重みの減衰とデータ増大が組み合わされ、より強い正則化がもたらされる。第2に,本手法はマルチモーダルCLIPモデルとチェーン・オブ・ファインタニングの事前学習における一般化の改善に有効である。

The training of over-parameterized neural networks has received much study in recent literature. An important consideration is the regularization of over-parameterized networks due to their highly nonconvex and nonlinear geometry. In this paper, we study noise injection algorithms, which can regularize the Hessian of the loss, leading to regions with flat loss surfaces. Specifically, by injecting isotropic Gaussian noise into the weight matrices of a neural network, we can obtain an approximately unbiased estimate of the trace of the Hessian. However, naively implementing the noise injection via adding noise to the weight matrices before backpropagation presents limited empirical improvements. To address this limitation, we design a two-point estimate of the Hessian penalty, which injects noise into the weight matrices along both positive and negative directions of the random noise. In particular, this two-point estimate eliminates the variance of the first-order Taylor's expansion term on the Hessian. We show a PAC-Bayes generalization bound that depends on the trace of the Hessian (and the radius of the weight space), which can be measured from data. We conduct a detailed experimental study to validate our approach and show that it can effectively regularize the Hessian and improve generalization. First, our algorithm can outperform prior approaches on sharpness-reduced training, delivering up to a 2.4% test accuracy increase for fine-tuning ResNets on six image classification datasets. Moreover, the trace of the Hessian reduces by 15.8%, and the largest eigenvalue is reduced by 9.7% with our approach. We also find that the regularization of the Hessian can be combined with weight decay and data augmentation, leading to stronger regularization. Second, our approach remains effective for improving generalization in pretraining multimodal CLIP models and chain-of-thought fine-tuning.

公開日:2024-09-23
翻訳日:2024-11-09 15:02:22
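
The two-point estimate mentioned in the abstract above can be motivated by a second-order Taylor expansion: $f(w \pm u) \approx f(w) \pm \nabla f(w)^\top u + \tfrac{1}{2} u^\top H u$, so averaging the two perturbed losses cancels the first-order term, and for $u \sim \mathcal{N}(0, \sigma^2 I)$ the remainder has expectation $\tfrac{\sigma^2}{2}\operatorname{tr}(H)$. The sketch below illustrates this on an arbitrary scalar function of a weight list; the function name `two_point_penalty` and its parameters are hypothetical, and it is a numerical illustration rather than the authors' training algorithm.

```python
import random


def two_point_penalty(f, w, sigma, num_samples, seed=0):
    """Monte Carlo estimate of E[(f(w + u) + f(w - u)) / 2 - f(w)], u ~ N(0, sigma^2 I).

    Up to higher-order terms this equals (sigma^2 / 2) * trace(Hessian of f at w).
    Evaluating both w + u and w - u cancels the first-order term grad(f) . u exactly.
    """
    rng = random.Random(seed)
    base = f(w)
    total = 0.0
    for _ in range(num_samples):
        u = [rng.gauss(0.0, sigma) for _ in w]
        plus = f([wi + ui for wi, ui in zip(w, u)])
        minus = f([wi - ui for wi, ui in zip(w, u)])
        total += 0.5 * (plus + minus) - base
    return total / num_samples


# Toy check on a quadratic with diagonal Hessian diag(1, 2, 3), so trace(H) = 6.
H_DIAG = [1.0, 2.0, 3.0]


def quadratic(grad):
    """Return f(w) = sum(0.5 * h_i * w_i^2 + g_i * w_i) for a chosen linear term g."""
    return lambda w: sum(0.5 * h * wi * wi + g * wi for h, g, wi in zip(H_DIAG, grad, w))
```

For the quadratic above the estimate depends only on the Hessian: changing the linear term `grad` leaves the result unchanged up to rounding, whereas a one-sided estimate `f(w + u) - f(w)` would carry the extra noise of `grad . u`.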

# OpenOOD v1.5: アウト・オブ・ディストリビューション検出のためのベンチマーク強化

OpenOOD v1.5: Enhanced Benchmark for Out-of-Distribution Detection ( http://arxiv.org/abs/2306.09301v4 )

ライセンス: Link先を確認

Jingyang Zhang, Jingkang Yang, Pengyun Wang, Haoqi Wang, Yueqian Lin, Haoran Zhang, Yiyou Sun, Xuefeng Du, Yixuan Li, Ziwei Liu, Yiran Chen, Hai Li,

(参考訳) アウト・オブ・ディストリビューション(OOD)検出は、オープンワールド・インテリジェントシステムの信頼性の高い運用に不可欠である。 OOD検出手法の出現にもかかわらず、評価の不整合は、この分野の進歩を追跡する上での課題である。 OpenOOD v1はOOD検出評価の統合を開始したが、スケーラビリティとユーザビリティの制限に直面した。本報告では,OOD検出手法の精度,標準化,ユーザフレンドリな評価を保証したOpenOOD v1.5を提案する。特に、OpenOOD v1.5は、評価機能をImageNetなどの大規模データセットに拡張し、未調査の重要でないフルスペクトルOOD検出を調査し、オンラインリーダーボードや使いやすい評価器などの新機能を導入している。この研究は、総合的な実験結果から得られた深い分析や洞察にも貢献し、OOD検出手法の知識プールを強化している。これらの拡張により、OpenOOD v1.5は進歩を加速し、OOD検出研究のためのより堅牢で包括的な評価ベンチマークを提供することを目的としている。

Out-of-Distribution (OOD) detection is critical for the reliable operation of open-world intelligent systems. Despite the emergence of an increasing number of OOD detection methods, the evaluation inconsistencies present challenges for tracking the progress in this field. OpenOOD v1 initiated the unification of the OOD detection evaluation but faced limitations in scalability and usability. In response, this paper presents OpenOOD v1.5, a significant improvement from its predecessor that ensures accurate, standardized, and user-friendly evaluation of OOD detection methodologies. Notably, OpenOOD v1.5 extends its evaluation capabilities to large-scale datasets such as ImageNet, investigates full-spectrum OOD detection which is important yet underexplored, and introduces new features including an online leaderboard and an easy-to-use evaluator. This work also contributes in-depth analysis and insights derived from comprehensive experimental results, thereby enriching the knowledge pool of OOD detection methodologies. With these enhancements, OpenOOD v1.5 aims to drive advancements and offer a more robust and comprehensive evaluation benchmark for OOD detection research.

公開日:2024-09-24
翻訳日:2024-11-09 15:02:22

# 時空間量子相関の因果分類

Causal classification of spatiotemporal quantum correlations ( http://arxiv.org/abs/2306.09336v2 )

ライセンス: Link先を確認

Minjeong Song, Varun Narasimhachar, Bartosz Regula, Thomas J. Elliott, Mile Gu,

(参考訳) 測定結果のみの相関から、そのような相関が一時的なものであるかどうかを2つの孤立した当事者が決定できるだろうか? つまり、2つの異なるタイミングで同じシステムを与えられたと判断できるのだろうか? 古典的な統計によると、量子論は一致しない。ここでは、そのような量子相関を時間的に特定できる必要十分条件を紹介する。時間反転下での時間的非対称性を実証し,空間的量子相関の尺度であることを明らかにした。以上の結果から,特定の量子相関は時間的固有矢印を持ち,様々な因果構造との整合性に基づいて,時空間における一般量子相関の分類が可能であることが示唆された。

From correlations in measurement outcomes alone, can two otherwise isolated parties establish whether such correlations are atemporal? That is, can they rule out that they have been given the same system at two different times? Classical statistics says no, yet quantum theory disagrees. Here, we introduce the necessary and sufficient conditions by which such quantum correlations can be identified as atemporal. We demonstrate the asymmetry of atemporality under time reversal, and reveal it to be a measure of spatial quantum correlation distinct from entanglement. Our results indicate that certain quantum correlations possess an intrinsic arrow of time, and enable classification of general quantum correlations across space-time based on their (in)compatibility with various underlying causal structures.

公開日:2024-09-23
翻訳日:2024-11-09 15:02:22

# 直接検出のための光学式暗黒物質計

Optomechanical dark matter instrument for direct detection ( http://arxiv.org/abs/2306.09726v2 )

ライセンス: Link先を確認

Christopher G. Baker, Warwick P. Bowen, Peter Cox, Matthew J. Dolan, Maxim Goryachev, Glen Harris,

(参考訳) 低質量暗黒物質を直接検出するための新しい手法を応用したオプトメカニカルダークマターインストゥルメント(ODIN)を提案する。我々は,超流動ヘリウムと相互作用する暗黒物質を光学的空洞で考える。有効場理論を用いて,暗黒物質がフォノンから散乱する速度を,高密度で駆動されるキャビティの音響モードで計算する。この散乱過程は、フォノンを基底状態の第2音響モードに堆積させる。堆積されたフォノン (\mu$eV range) は、ポンプレーザーとの光学的相互作用により光子(eV range)に変換される。この光子を効率よく検出することができ、keVスケールの暗黒物質を感度よくプローブする手段を提供する。我々は,背景の現実的な推定を行い,そのような実験に関連する技術的課題について議論する。我々は、0.5から300keVまでの暗黒物質質量に対する暗黒物質-核子相互作用の予測限界を計算し、将来の装置が$\mathcal{O}(10^{-32})$ cm$^2$の低い断面を探査できると推定した。

We propose the Optomechanical Dark-matter INstrument (ODIN), based on a new method for the direct detection of low-mass dark matter. We consider dark matter interacting with superfluid helium in an optomechanical cavity. Using an effective field theory, we calculate the rate at which dark matter scatters off phonons in a highly populated, driven acoustic mode of the cavity. This scattering process deposits a phonon into a second acoustic mode in its ground state. The deposited phonon ($\mu$eV range) is then converted to a photon (eV range) via an optomechanical interaction with a pump laser. This photon can be efficiently detected, providing a means to sensitively probe keV scale dark matter. We provide realistic estimates of the backgrounds and discuss the technical challenges associated with such an experiment. We calculate projected limits on dark matter-nucleon interactions for dark matter masses ranging from 0.5 to 300 keV and estimate that a future device could probe cross-sections as low as $\mathcal{O}(10^{-32})$ cm$^2$.

公開日:2024-09-24
翻訳日:2024-11-09 14:51:04

# フェデレーション学習のための視覚変換器の連続的適応

Continual Adaptation of Vision Transformers for Federated Learning ( http://arxiv.org/abs/2306.09970v2 )

ライセンス: Link先を確認

Shaunak Halbe, James Seale Smith, Junjiao Tian, Zsolt Kira,

(参考訳) 本稿では,サーバがクライアントの集合と通信し,データを共有したり保存したりすることなく,新たな概念を段階的に学習する,CFL(Continual Federated Learning)の重要な課題に焦点を当てる。この問題の複雑さは、継続学習とフェデレート学習の両方の観点からの課題によって複雑化されます。具体的には、CFLセットアップでトレーニングされたモデルは、クライアント間のデータの異質性によって悪化する破滅的な忘れ込みに悩まされる。この問題に対する既存の試みは、クライアントや通信チャネルに大きなオーバーヘッドを課す傾向にあり、あるいは保存されたデータにアクセスする必要があるため、プライバシによる実際の使用には適さない。本稿では,記憶データへのアクセスを必要とせず,オーバーヘッドコストを最小限に抑えながら,忘れと不均一性に取り組む。本研究では,視覚変換器の文脈でこの問題を考察し,動的分布に適応するパラメータ効率のアプローチを,忘却を最小限に抑えながら検討する。我々は、プロンプトベースのアプローチ(プロンプトとクラシファイアヘッドのみを通信しなければならない)を活用し、サーバにおけるクライアントモデルを統合するための、新しくて軽量な生成と蒸留方式を提案する。我々は、画像分類の問題を定式化し、比較のための強力なベースラインを確立し、CIFAR-100上で実験を行い、ImageNet-RやDomainNetのような大規模データセットに挑戦する。提案手法は,通信コストとクライアントレベルの計算コストを大幅に削減しつつ,既存手法と独自のベースラインを最大7%向上させる。コードは https://github.com/shaunak27/hepco-fed で公開されている。

In this paper, we focus on the important yet understudied problem of Continual Federated Learning (CFL), where a server communicates with a set of clients to incrementally learn new concepts over time without sharing or storing any data. The complexity of this problem is compounded by challenges from both the Continual and Federated Learning perspectives. Specifically, models trained in a CFL setup suffer from catastrophic forgetting which is exacerbated by data heterogeneity across clients. Existing attempts at this problem tend to impose large overheads on clients and communication channels or require access to stored data which renders them unsuitable for real-world use due to privacy. In this paper, we attempt to tackle forgetting and heterogeneity while minimizing overhead costs and without requiring access to any stored data. We study this problem in the context of Vision Transformers and explore parameter-efficient approaches to adapt to dynamic distributions while minimizing forgetting. We achieve this by leveraging a prompting based approach (such that only prompts and classifier heads have to be communicated) and proposing a novel and lightweight generation and distillation scheme to consolidate client models at the server. We formulate this problem for image classification and establish strong baselines for comparison, conduct experiments on CIFAR-100 as well as challenging, large-scale datasets like ImageNet-R and DomainNet. Our approach outperforms both existing methods and our own baselines by as much as 7% while significantly reducing communication and client-level computation costs. Code available at https://github.com/shaunak27/hepco-fed.

公開日:2024-09-22
翻訳日:2024-11-09 14:51:04

# 準周期モザイク格子における多動性エッジの探索

Probing multi-mobility edges in quasiperiodic mosaic lattices ( http://arxiv.org/abs/2306.10829v2 )

ライセンス: Link先を確認

Jun Gao, Ivan M. Khaymovich, Xiao-Wei Wang, Ze-Sheng Xu, Adrian Iovan, Govind Krishna, Jiayidaer Jieensi, Andrea Cataldo, Alexander V. Balatsky, Val Zwiller, Ali W. Elshaari,

(参考訳) モビリティエッジ(ME)は、エネルギースペクトルにおける局所化状態と局所化状態の間の重要な遷移を示す、局在化物理学を理解するための重要な概念である。アンダーソン局在化スケーリング理論は、低次元系におけるMEの欠如を予測する。そのため、特に低次元の単一粒子に対する正確なMEの探索は、最近理論と実験的研究の両方に大きな関心を集め、顕著な進歩をもたらした。しかし、複数のMEを示す単一のシステムや、強い障害領域内であっても、拡張状態の持続的な存在の可能性など、いくつかのオープンな疑問が残っている。ここでは、準周期モザイク格子と精密に設計されたナノフォトニック回路を用いて、これらの問題に対処する実験的な証拠を提供する。本研究は, 2次対称性の破れと変調周期の異なる格子における拡張状態と局所状態の共存を実証するものである。単一サイトインジェクションと障害レベルの走査により,変調格子のMEを概ね調査することができた。これらの結果は、最近の理論予測を裏付け、ME物理を研究するための新しい道を導入し、ハイブリッド集積フォトニックデバイスを用いた量子状態におけるME物理のさらなる探索にインスピレーションを与える。

The mobility edge (ME) is a crucial concept in understanding localization physics, marking the critical transition between extended and localized states in the energy spectrum. Anderson localization scaling theory predicts the absence of ME in lower dimensional systems. Hence, the search for exact MEs, particularly for single particles in lower dimensions, has recently garnered significant interest in both theoretical and experimental studies, resulting in notable progress. However, several open questions remain, including the possibility of a single system exhibiting multiple MEs and the continual existence of extended states, even within the strong disorder domain. Here, we provide experimental evidence to address these questions by utilizing a quasiperiodic mosaic lattice with meticulously designed nanophotonic circuits. Our observations demonstrate the coexistence of both extended and localized states in lattices with broken duality symmetry and varying modulation periods. By single site injection and scanning the disorder level, we could approximately probe the ME of the modulated lattice. These results corroborate recent theoretical predictions, introduce a new avenue for investigating ME physics, and offer inspiration for further exploration of ME physics in the quantum regime using hybrid integrated photonic devices.

公開日:2024-09-23
翻訳日:2024-11-09 14:51:04

# 条件付きデュアルオートエンコーダによるダークシャワーのトリガ

Triggering Dark Showers with Conditional Dual Auto-Encoders ( http://arxiv.org/abs/2306.12955v2 )

ライセンス: Link先を確認

Luca Anzalone, Simranjit Singh Chhibra, Benedikt Maier, Nadezda Chernyavskaya, Maurizio Pierini,

(参考訳) 本稿では,衝突型加速器における汎用的かつモデルに依存しない新物理探索のための条件付きデュアルオートエンコーダ(CoDAE)のファミリーを提案する。新たな種類の粒子や相互作用から生じる新物理の信号は、予測される背景事象に対するデータの偏差を引き起こす異常として扱う。本研究では,背景サンプルのみを用いる正常データのみの異常検出を行い,物理ベースの前処理や信号に対する強い仮定を使わずに,大規模かつ非常に疎な生の検出器画像に(変分)オートエンコーダを適用して,強い力のダーク版の発現を探索する。提案したCoDAEは汎用的なデュアルエンコーダ設計であり、空間条件付けにより補助的かつコンパクトな潜在空間を学習でき、競合する物理ベースのベースラインや関連手法を明確に上回り、完全教師付きモデルとの差も縮める。教師なしモデルが複数のダークシャワーモデルに対して優れた識別性能を示すのはこれが初めてであり、ATLASやCMSのような大型ハドロン衝突型加速器実験のリアルタイムイベントトリガシステムなどに導入できる、正確で高速かつモデルに依存しないアルゴリズムとして本手法が適していることを示している。

We present a family of conditional dual auto-encoders (CoDAEs) for generic and model-independent new physics searches at colliders. New physics signals, which arise from new types of particles and interactions, are considered in our study as anomalies causing deviations in data with respect to expected background events. In this work, we perform a normal-only anomaly detection, which employs only background samples, to search for manifestations of a dark version of the strong force by applying (variational) auto-encoders on raw detector images, which are large and highly sparse, without leveraging any physics-based pre-processing or strong assumption on the signals. The proposed CoDAE has a dual-encoder design, which is general and can learn an auxiliary yet compact latent space through spatial conditioning, showing a neat improvement over competitive physics-based baselines and related approaches, therefore also reducing the gap with fully supervised models. It is the first time an unsupervised model is shown to exhibit excellent discrimination against multiple dark shower models, illustrating the suitability of this method as an accurate, fast, model-independent algorithm to deploy, e.g., in the real-time event triggering systems of Large Hadron Collider experiments such as ATLAS and CMS.

公開日:2024-09-24
翻訳日:2024-11-09 14:51:04

# HamLib: 量子アルゴリズムとハードウェアのベンチマークのためのハミルトニアンのライブラリ

HamLib: A library of Hamiltonians for benchmarking quantum algorithms and hardware ( http://arxiv.org/abs/2306.13126v4 )

ライセンス: Link先を確認

Nicolas PD Sawaya, Daniel Marti-Dafcik, Yang Ho, Daniel P Tabor, David E Bernal Neira, Alicia B Magann, Shavindra Premaratne, Pradeep Dubey, Anne Matsuura, Nathan Bishop, Wibe A de Jong, Simon Benjamin, Ojas Parekh, Norm Tubman, Katherine Klymko, Daan Camps,

(参考訳) 計算ハードウェア、ソフトウェア、アルゴリズムを特徴付け、ベンチマークするためには、多くの問題インスタンスを手元に持つことが不可欠である。これは量子計算においても同様であり、実世界の問題インスタンスの大規模な集合があればベンチマーク研究が可能になり、アルゴリズムとハードウェアの設計の両方を改善するのに役立つ。この目的のために、量子ビットベースの量子ハミルトニアンの大規模なデータセットを提示する。 HamLib(Hamiltonian Library)と呼ばれるこのデータセットは、オンラインで無料で利用可能であり、2から1000量子ビットまでの問題サイズを含んでいる。 HamLibには、Heisenbergモデル、Fermi-Hubbardモデル、Bose-Hubbardモデル、分子電子構造、分子振動構造、MaxCut、Max-$k$-SAT、Max-$k$-Cut、QMaxCut、巡回セールスパーソン問題の問題インスタンスが含まれている。この取り組みの目標は、(a) 問題インスタンスを作成して量子ビット表現にマッピングする必要をなくして研究者の時間を節約すること、(b) 新しいアルゴリズムやハードウェアをより徹底的にテストできるようにすること、(c) 研究間での再現性と標準化を可能にすることである。

In order to characterize and benchmark computational hardware, software, and algorithms, it is essential to have many problem instances on-hand. This is no less true for quantum computation, where a large collection of real-world problem instances would allow for benchmarking studies that in turn help to improve both algorithms and hardware designs. To this end, here we present a large dataset of qubit-based quantum Hamiltonians. The dataset, called HamLib (for Hamiltonian Library), is freely available online and contains problem sizes ranging from 2 to 1000 qubits. HamLib includes problem instances of the Heisenberg model, Fermi-Hubbard model, Bose-Hubbard model, molecular electronic structure, molecular vibrational structure, MaxCut, Max-$k$-SAT, Max-$k$-Cut, QMaxCut, and the traveling salesperson problem. The goals of this effort are (a) to save researchers time by eliminating the need to prepare problem instances and map them to qubit representations, (b) to allow for more thorough tests of new algorithms and hardware, and (c) to allow for reproducibility and standardization across research studies.

公開日:2024-09-24
翻訳日:2024-11-09 14:51:04

# Universal Session Protocol: リモートコードの実行に対する一般的な解決策

Universal Session Protocol: A General Solution to Remote Code Execution ( http://arxiv.org/abs/2306.14339v2 )

ライセンス: Link先を確認

Jonathon Anderson,

(参考訳) 現在、TCP/IPモデルは、アプリケーションへの接続要求をすべて無条件に受け入れるため、匿名での脆弱性の悪用を可能にしている。このモデルでは、認証はアプリケーションへのアクセスの前提条件としてではなく、アプリケーション自体の内部にのみ組み込まれている。私は、TCP/IPモデルのアーキテクチャの変更としてユニバーサルセッションプロトコルを提案する。これは、認証の交渉と履行のための構造化された汎用プロセスを備えたセッション層を追加するものである。ユニバーサルセッションプロトコルは、セキュリティクリティカルなシステムにおける未認証のデータ処理を排除するという緊急かつ重要な必要性に対処する。 TCP/IPセキュリティに関するこれまでの研究は、アプリケーションの設計と実装、および既存のプロトコル層に重点を置いており、緩和策としてセッション層を追加することは提案されてこなかった。独立した認証層を実装しなければ、人命やセキュリティに関わる重要インフラを含む、グローバルインターネットに接続されたすべてのリソースが、匿名で追跡不能なソースからの攻撃に対して脆弱なままになる。 Universal Session Protocolは、アプリケーション内でデータストリームにアクセスできるようになる前に明示的に認証を行うTCP/IPセッション層を確立することで解決策を提供する。認証後、IDはデータストリームに関連付けられ、フォレンジック目的ですべてのデータをそのIDに関連付けることができる。認証が失敗した場合、アプリケーションはユーザーデータを決して処理せず、サービスは匿名の悪意ある攻撃者から安全になる。

Currently, the TCP/IP model enables exploitation of vulnerabilities anonymously by unconditionally fulfilling every request for a connection into an application; the model only incorporates authentication within applications themselves, rather than as a precondition for access into applications. I am proposing the Universal Session Protocol as a change to the architecture of the TCP/IP model to include a session layer featuring a structured generalized process for authentication negotiation and fulfillment. The Universal Session Protocol addresses an urgent and vital need to eliminate unauthenticated data processing on security critical systems. Previous work regarding TCP/IP security has focused on the application design and implementation and existing protocol layers, but has failed to posit the addition of a session layer as a mitigating control. Failing to implement a distinct authentication layer leaves every resource connected to the global Internet, including life and security critical infrastructure, vulnerable to attacks from anonymous and untraceable sources. The Universal Session Protocol provides a solution by establishing a TCP/IP Session Layer that explicitly provides authentication before a data stream is accessible within an application. After authentication, an identity is associated with the data stream so that all data may be related back to that identity for forensic purposes. If authentication fails, the application will never process user data, rendering the service safe from anonymous bad actors.

公開日:2024-09-24
翻訳日:2024-11-09 14:51:04
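
The gating idea in the Universal Session Protocol abstract above (no application data is processed until authentication succeeds, and afterwards every chunk of data is tied to an identity) can be illustrated with a small toy. This is a sketch under stated assumptions, not the protocol defined in the paper: the class name `SessionGate`, the HMAC challenge-response exchange, and the shared-key table are all hypothetical choices made only for this example.

```python
import hashlib
import hmac
import secrets


class SessionGate:
    """Toy session layer: data reaches the application only after authentication.

    Hypothetical illustration. The HMAC challenge-response used here is an
    assumption, not the negotiation mechanism specified by the paper.
    """

    def __init__(self, keys, app_handler):
        self._keys = keys                # identity -> shared secret (bytes)
        self._app_handler = app_handler  # callable(identity, data)
        self._challenge = secrets.token_bytes(16)
        self.identity = None             # set once authentication succeeds

    @property
    def challenge(self):
        return self._challenge

    def authenticate(self, identity, response):
        """Accept the peer only if it proves knowledge of the shared secret."""
        key = self._keys.get(identity)
        if key is None:
            return False
        expected = hmac.new(key, self._challenge, hashlib.sha256).digest()
        if hmac.compare_digest(expected, response):
            self.identity = identity
            return True
        return False

    def receive(self, data):
        """Forward data to the application, tagged with the authenticated identity."""
        if self.identity is None:
            # Unauthenticated data never reaches the application handler.
            raise PermissionError("unauthenticated data rejected")
        return self._app_handler(self.identity, data)
```

The application handler always receives the identity alongside the data, which is what would make per-identity forensic logging possible in this toy setup.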

# 時間と状態依存型ニューラル遅延微分方程式

Time and State Dependent Neural Delay Differential Equations ( http://arxiv.org/abs/2306.14545v2 )

ライセンス: Link先を確認

Thibault Monsel, Onofrio Semeraro, Lionel Mathelin, Guillaume Charpiat,

(参考訳) 物理学や工学から医学、経済学まで、幅広いクラスの問題の支配方程式には、不連続性と遅延項が現れる。これらのシステムは、標準的な常微分方程式(ODE)や、ニューラル常微分方程式(NODE)のようなデータ駆動の近似では適切にモデル化およびシミュレーションすることができない。この問題を回避するために、一般には潜在変数を導入して高次元空間で系の力学を解き、元の空間への射影として解を得る。しかし、この解は物理的解釈可能性に欠ける。対照的に、遅延微分方程式(DDE)とそのデータ駆動の近似は、このようなシステムを特徴づける良い候補として自然に現れる。本稿では,複数の、状態および時間に依存する遅延をモデル化可能な汎用かつ柔軟なフレームワークであるNeural State-Dependent DDE(SDDDE)を導入することで,最近提案されたNeural DDEを再考する。提案手法は競争力があり,様々な遅延力学系において他の連続クラスのモデルよりも優れていることを示す。コードはリポジトリ https://github.com/thibmonsel/Time-and-State-Dependent-Neural-Delay-Differential-Equations で公開されている。

Discontinuities and delayed terms are encountered in the governing equations of a large class of problems ranging from physics and engineering to medicine and economics. These systems cannot be properly modelled and simulated with standard Ordinary Differential Equations (ODE), or data-driven approximations such as Neural Ordinary Differential Equations (NODE). To circumvent this issue, latent variables are typically introduced to solve the dynamics of the system in a higher dimensional space and obtain the solution as a projection to the original space. However, this solution lacks physical interpretability. In contrast, Delay Differential Equations (DDEs), and their data-driven approximated counterparts, naturally appear as good candidates to characterize such systems. In this work we revisit the recently proposed Neural DDE by introducing Neural State-Dependent DDE (SDDDE), a general and flexible framework that can model multiple and state- and time-dependent delays. We show that our method is competitive and outperforms other continuous-class models on a wide variety of delayed dynamical systems. Code is available at https://github.com/thibmonsel/Time-and-State-Dependent-Neural-Delay-Differential-Equations.

公開日:2024-09-26
翻訳日:2024-11-09 14:51:04
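
To make the notion of a time- and state-dependent delay in the abstract above concrete, the following is a minimal classical solver sketch for x'(t) = f(t, x(t), x(t - tau(t, x(t)))). It is not the Neural SDDDE method itself: in that setting `f` and `delay` would presumably be learned networks, whereas here they are plain Python callables. The function name `integrate_dde`, the fixed-step Euler scheme, and the linear interpolation of past values are all assumptions made for illustration.

```python
def integrate_dde(f, delay, history, t_end, dt):
    """Fixed-step Euler for x'(t) = f(t, x(t), x(t - delay(t, x(t)))).

    `history(t)` supplies x(t) for t <= 0 and `delay` must be non-negative.
    Delayed values that fall between grid points are linearly interpolated.
    """
    n_steps = round(t_end / dt)
    ts = [0.0]
    xs = [history(0.0)]

    def lookup(t):
        # Value of the solution at a (possibly past) time t.
        if t <= 0.0:
            return history(t)
        pos = t / dt
        i = int(pos)
        if i >= len(xs) - 1:
            return xs[-1]  # clamp to the newest computed value
        frac = pos - i
        return xs[i] + frac * (xs[i + 1] - xs[i])

    for k in range(n_steps):
        t, x = ts[-1], xs[-1]
        tau = delay(t, x)  # the delay may depend on time and on the state
        x_delayed = lookup(t - tau)
        xs.append(x + dt * f(t, x, x_delayed))
        ts.append((k + 1) * dt)
    return ts, xs


# Example: x'(t) = -x(t - tau) with a state-dependent delay tau = 1 + 0.5|x|.
ts, xs = integrate_dde(
    f=lambda t, x, x_delayed: -x_delayed,
    delay=lambda t, x: 1.0 + 0.5 * abs(x),
    history=lambda t: 1.0,
    t_end=1.0,
    dt=0.01,
)
```

Because the delay is re-evaluated from the current state at every step, the same routine covers constant, time-dependent, and state-dependent delays.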

# 4重境界誤差リファインメントによる高品質な未知物体インスタンスセグメンテーション

High-quality Unknown Object Instance Segmentation via Quadruple Boundary Error Refinement ( http://arxiv.org/abs/2306.16132v3 )

ライセンス: Link先を確認

Seunghyeok Back, Sangbeom Lee, Kangmin Kim, Joosoon Lee, Sungho Shin, Jemo Maeng, Kyoobin Lee,

(参考訳) 非構造環境における未知の物体の高精度かつ効率的なセグメンテーションは、ロボット操作に不可欠である。 Unknown Object Instance Segmentation (UOIS)は、未知のカテゴリや背景にあるすべての物体を識別することを目的としており、様々なロボットタスクにおいて重要な機能となっている。しかし、現在の手法は過剰セグメンテーションと過少セグメンテーションに苦しんでおり、把持のような操作タスクの失敗につながる。これらの課題に対処するため,我々は高品質なUOISのための新しい誤差情報に基づく洗練手法QuBER(Quadruple Boundary Error Refinement)を提案する。 QuBERはまず、初期セグメンテーションのインスタンス境界における4種類の境界誤差(真陽性、真陰性、偽陽性、偽陰性の画素)を推定する。その後、誤差誘導型の融合機構を使用してセグメンテーションを洗練し、細粒度の誤りとインスタンスレベルの誤りの両方を効果的に補正する。 3つの公開ベンチマークでの広範な評価は、QuBERが最先端の手法より優れており、0.1秒未満の高速な推論時間を維持しつつ、様々なUOIS手法を一貫して改善することを示している。さらに,QuBERは,乱雑な環境下での対象物体の把持の成功率を向上させることを実証した。コードと補足資料は https://sites.google.com/view/uois-quber で入手できる。

Accurate and efficient segmentation of unknown objects in unstructured environments is essential for robotic manipulation. Unknown Object Instance Segmentation (UOIS), which aims to identify all objects in unknown categories and backgrounds, has become a key capability for various robotic tasks. However, current methods struggle with over-segmentation and under-segmentation, leading to failures in manipulation tasks such as grasping. To address these challenges, we propose QuBER (Quadruple Boundary Error Refinement), a novel error-informed refinement approach for high-quality UOIS. QuBER first estimates quadruple boundary errors (true positive, true negative, false positive, and false negative pixels) at the instance boundaries of the initial segmentation. It then refines the segmentation using an error-guided fusion mechanism, effectively correcting both fine-grained and instance-level segmentation errors. Extensive evaluations on three public benchmarks demonstrate that QuBER outperforms state-of-the-art methods and consistently improves various UOIS techniques while maintaining a fast inference time of less than 0.1 seconds. Additionally, we demonstrate that QuBER improves the success rate of grasping target objects in cluttered environments. Code and supplementary materials are available at https://sites.google.com/view/uois-quber.

公開日:2024-09-23
翻訳日:2024-11-09 14:51:04
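
The four boundary error classes named in the QuBER abstract above can be made concrete with a small sketch. Note the assumption: the abstract says QuBER estimates these errors, presumably because no ground truth exists at inference time, whereas the hypothetical helper below simply computes them from a predicted and a reference boundary mask, as one might when building a training target. The function name and the list-of-lists mask format are illustrative only.

```python
def quadruple_boundary_errors(pred_boundary, true_boundary):
    """Label every pixel TP, TN, FP or FN by comparing two binary boundary masks.

    Both masks are equally sized lists of rows containing 0/1 values.
    """
    labels = []
    for pred_row, true_row in zip(pred_boundary, true_boundary):
        row = []
        for p, t in zip(pred_row, true_row):
            if p and t:
                row.append("TP")  # boundary predicted and present
            elif p and not t:
                row.append("FP")  # boundary predicted but absent
            elif t:
                row.append("FN")  # boundary missed
            else:
                row.append("TN")  # correctly left as non-boundary
        labels.append(row)
    return labels


pred = [[1, 1, 0],
        [0, 0, 0]]
true = [[1, 0, 0],
        [0, 1, 0]]
error_map = quadruple_boundary_errors(pred, true)
```

In this sketch the FP and FN pixels mark where a refinement step would need to remove or add boundary, respectively.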

# ボソニックガウスチャネルの低地・高地容量領域解析

Low-ground/High ground capacity regions analysis for Bosonic Gaussian Channels ( http://arxiv.org/abs/2306.16350v2 )

ライセンス: Link先を確認

Farzad Kianvash, Marco Fanizza, Vittorio Giovannetti,

(参考訳) 本稿では, チャネルの連結から生じる, 単一モードの位相非感受性ガウスボソニックチャネル間の相互関係を包括的に特徴付ける。この特徴付けにより、これらの写像のパラメータ空間において、低地と高地という2つの異なる領域を特定できる。低地領域では情報容量は指定された基準値よりも小さく、高地領域では証明可能な形で基準値よりも大きい。直接的な結果として、これらの写像の量子容量とプライベート容量について、既知の上界と合成規則を組み合わせた明示的な上界の集合を体系的に示し、既存の結果を改善する。

We present a comprehensive characterization of the interconnections between single-mode, phase-insensitive Gaussian Bosonic Channels resulting from channel concatenation. This characterization enables us to identify, in the parameter space of these maps, two distinct regions: low-ground and high-ground. In the low-ground region, the information capacities are smaller than a designated reference value, while in the high-ground region, they are provably greater. As a direct consequence, we systematically outline an explicit set of upper bounds for the quantum and private capacity of these maps, which combine known upper bounds and composition rules, improving upon existing results.

公開日:2024-09-26
翻訳日:2024-11-09 14:51:04

# オルタナティブ・テレスコープ・アライメント : 効率的なマルチモーダルアライメント法

Alternative Telescopic Displacement: An Efficient Multimodal Alignment Method ( http://arxiv.org/abs/2306.16950v4 )

ライセンス: Link先を確認

Jiahao Qin, Yitao Xu, Zong Lu, Xiaojun Zhang,

(参考訳) マルチモーダルデータ統合の領域では、機能アライメントが重要な役割を果たす。本稿では,マルチモーダル情報の融合に革命をもたらす機能アライメントに対する革新的なアプローチを提案する。提案手法では,異なるモードをまたいだ特徴表現の遠隔的変位と拡張の新たな反復的プロセスを用いて,共有特徴空間内の一貫性のある統一表現を導出する。この高度な技術は、抽象の最高レベルにおいて複雑なクロスモーダル相互作用を捕捉し、活用する驚くべき能力を示している。その結果,マルチモーダル学習タスクの性能は大幅に向上した。厳密な比較分析により、様々なアプリケーションにまたがる既存のマルチモーダル融合パラダイムに対するアプローチの優位性を確立する。時系列,視覚データ,テキスト情報を含む多面的データセットを用いた総合的な経験的評価は,本手法がこの分野における前例のないベンチマークを達成していることを示す証拠となる。この研究は、マルチモーダル学習における最先端の進歩だけでなく、複雑な分析シナリオにおける異なるデータモダリティ間の相乗効果を探求するための新たな道を開いた。

In the realm of multimodal data integration, feature alignment plays a pivotal role. This paper introduces an innovative approach to feature alignment that revolutionizes the fusion of multimodal information. Our method employs a novel iterative process of telescopic displacement and expansion of feature representations across different modalities, culminating in a coherent unified representation within a shared feature space. This sophisticated technique demonstrates a remarkable ability to capture and leverage complex crossmodal interactions at the highest levels of abstraction. As a result, we observe significant enhancements in the performance of multimodal learning tasks. Through rigorous comparative analysis, we establish the superiority of our approach over existing multimodal fusion paradigms across a diverse array of applications. Comprehensive empirical evaluations conducted on multifaceted datasets encompassing temporal sequences, visual data, and textual information provide compelling evidence that our method achieves unprecedented benchmarks in the field. This work not only advances the state of the art in multimodal learning but also opens new avenues for exploring the synergies between disparate data modalities in complex analytical scenarios.

公開日:2024-09-25
翻訳日:2024-11-09 14:51:04

# QAOAのためのLXミキサー:部分空間に制限された最適ミキサーと安定化器形式

LX-mixers for QAOA: Optimal mixers restricted to subspaces and the stabilizer formalism ( http://arxiv.org/abs/2306.17083v6 )

ライセンス: Link先を確認

Franz G. Fuchs, Ruben Pariente Bassa,

(参考訳) 与えられた部分空間を保存するミキサーを理解し、構築するための新しい形式を提示する。この方法は、誤り訂正符号で使われるスタビライザー形式を結び付けて利用する。これは、組合せ最適化問題を解くための一般的なメタヒューリスティックである量子近似最適化アルゴリズム(QAOA)を、問題の制約によって実行可能部分空間が大きいが容易に指定できるような設定に適用する場合に有用である。提案手法は,制御NOTゲートの数に関して資源効率のよいミキサーを構築する体系的な方法を与え,よく知られたXミキサーとXYミキサーの一般化、およびGroverミキサーの緩和として理解できる。すなわち、任意の部分空間の基底が与えられれば、その部分空間を保存する資源効率のよいミキサーを構築できる。示した数値例では, 従来の結果と比較してCXゲートが劇的に減少している。この手法は、部分空間をスタビライザーSの符号空間に分割し、これらの符号空間に関連する論理回転Xゲートを順に適用するものとして理解できるため、論理Xミキサーあるいは論理X QAOA(LX-QAOA)と呼ぶ。全体として、この新しい視点が量子アルゴリズムの発展に関するさらなる洞察に繋がることを願っている。

We present a novel formalism to both understand and construct mixers that preserve a given subspace. The method connects and utilizes the stabilizer formalism that is used in error correcting codes. This can be useful in the setting when the quantum approximate optimization algorithm (QAOA), a popular meta-heuristic for solving combinatorial optimization problems, is applied in the setting where the constraints of the problem lead to a feasible subspace that is large but easy to specify. The proposed method gives a systematic way to construct mixers that are resource efficient in the number of controlled-NOT gates and can be understood as a generalization of the well-known X and XY mixers and a relaxation of the Grover mixer: Given a basis of any subspace, a resource efficient mixer can be constructed that preserves the subspace. The numerical examples provided show a dramatic reduction of CX gates when compared to previous results. We call our approach logical X-Mixer or logical X QAOA (LX-QAOA), since it can be understood as dividing the subspace into code spaces of stabilizers S and consecutively applying logical rotational X gates associated with these code spaces. Overall, we hope that this new perspective can lead to further insight into the development of quantum algorithms.

公開日:2024-09-23
翻訳日:2024-11-09 14:51:04

# 拡散モデルによる逆量子化と色移動

Dequantization and Color Transfer with Diffusion Models ( http://arxiv.org/abs/2307.02698v4 )

ライセンス: Link先を確認

Vaibhav Vavilala, Faaris Shaik, David Forsyth,

(参考訳) 自然画像に対する新しい画像編集を可能にする、画像の逆量子化拡散モデルを示す。量子化画像は、パッチベースの編集やパレット転送を簡単に抽象化できるため、量子化画像上での操作を提案する。特に,カラーパレットによって拡散モデルの出力が制御しやすく、解釈しやすくなることを示す。まず,JPEGノイズ低減モデルなどの既存の画像復元手法では不十分であることを確認する。次に、我々のモデルが、ユーザが要求したカラーパレットを尊重する自然な画像を生成できることを実証する。パレット転送のために,重み付き二部マッチングに基づく手法を提案する。そして、極端なパレット転送の後であっても、本モデルがユーザのクエリを尊重したもっともらしい画像を生成することを示す。本手法は、画像の一部または全体について、ソースのテクスチャを任意に条件として与えることができる。これにより、入力と異なる輝度の色を生成できないという、既存の画像カラー化手法に共通する問題を克服する。輝度、画像勾配、しきい値処理した勾配など、テクスチャ条件付けのいくつかの選択肢とそのトレードオフを評価したところ、テクスチャと色の制御を同時に維持する点では、しきい値処理した勾配が最も優れていた。本手法は、ソースのテクスチャを尊重しながら画像のパッチを再着色するという、別の実用的な編集にも拡張できる。我々の手順は、いくつかの定性的および定量的な評価によって裏付けられている。

We demonstrate an image dequantizing diffusion model that enables novel image edits on natural images. We propose operating on quantized images because they offer easy abstraction for patch-based edits and palette transfer. In particular, we show that color palettes can make the output of the diffusion model easier to control and interpret. We first establish that existing image restoration methods are not sufficient, such as JPEG noise reduction models. We then demonstrate that our model can generate natural images that respect the color palette the user asked for. For palette transfer, we propose a method based on weighted bipartite matching. We then show that our model generates plausible images even after extreme palette transfers, respecting user query. Our method can optionally condition on the source texture in part or all of the image. In doing so, we overcome a common problem in existing image colorization methods that are unable to produce colors with a different luminance than the input. We evaluate several possibilities for texture conditioning and their trade-offs, including luminance, image gradients, and thresholded gradients, the latter of which performed best in maintaining texture and color control simultaneously. Our method can be usefully extended to another practical edit: recoloring patches of an image while respecting the source texture. Our procedure is supported by several qualitative and quantitative evaluations.

公開日:2024-09-21
翻訳日:2024-11-09 14:51:04
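
The abstract above mentions palette transfer based on weighted bipartite matching without giving details. As a rough illustration of the general idea only, the sketch below pairs each source palette color with a distinct target color so that the total squared RGB distance is minimal. The cost function, the brute-force search over permutations, and the name `match_palettes` are assumptions for this example and are not taken from the paper.

```python
from itertools import permutations


def match_palettes(source, target):
    """Return `perm` such that source[i] is mapped to target[perm[i]].

    Minimum-cost bipartite matching by exhaustive search. The cost of a
    pairing is the squared RGB distance (an illustrative choice).
    Requires len(source) <= len(target); only practical for small palettes.
    """
    def cost(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    best = min(
        permutations(range(len(target)), len(source)),
        key=lambda perm: sum(cost(source[i], target[j]) for i, j in enumerate(perm)),
    )
    return list(best)


source_palette = [(0, 0, 0), (255, 255, 255), (255, 0, 0)]
target_palette = [(250, 10, 10), (5, 5, 5), (240, 240, 240)]
mapping = match_palettes(source_palette, target_palette)
```

Exhaustive search grows factorially with palette size. For larger palettes a polynomial-time assignment solver, such as `scipy.optimize.linear_sum_assignment`, would be the usual substitute.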

# 恒常的ホモロジーランク関数を用いた推論の安定性

Stability for Inference with Persistent Homology Rank Functions ( http://arxiv.org/abs/2307.02904v2 )

ライセンス: Link先を確認

Qiquan Wang, Inés García-Redondo, Pierre Faugère, Gregory Henselman-Petrusek, Anthea Monod,

(参考訳) 永続ホモロジーバーコードとダイアグラムは、点雲、ネットワーク、関数など、幅広い複雑なデータ構造の「形」を捉えたトポロジ的データ解析の基盤である。しかし、その複雑な幾何学的構造のため、統計的な設定での使用は困難である。本稿では,統計と機械学習のツールとして,バーコードと永続化図に数学的に等価な永続的ホモロジーランク関数を再検討する。ランク関数は、関数であり、関数の形でデータに適合する統計の領域である、機能データ分析(FDA)の統計理論の直接的な適用を可能にする。しかし、実際にバーコードに対して提示される重要な課題は、安定性の欠如である。データの忠実な表現としての使用を検証する上で重要な特性であり、したがって実行可能な要約統計量である。本稿では,FDA 統合のための適切な基準の下で,永続的ホモロジーランク関数に対する2つの安定性結果を導出することにより,このギャップを埋める。次に、機能的推論統計学および機械学習におけるランク関数の性能を、単パラメータおよび多パラメータの永続的ホモロジーの両方において、実データアプリケーション上で研究する。階数関数によって捕捉される永続的ホモロジーの使用は、既存の非永続的アプローチよりも明らかな改善をもたらす。

Persistent homology barcodes and diagrams are a cornerstone of topological data analysis that capture the "shape" of a wide range of complex data structures, such as point clouds, networks, and functions. However, their use in statistical settings is challenging due to their complex geometric structure. In this paper, we revisit the persistent homology rank function, which is mathematically equivalent to a barcode and persistence diagram, as a tool for statistics and machine learning. Rank functions, being functions, enable the direct application of the statistical theory of functional data analysis (FDA), a domain of statistics adapted for data in the form of functions. A key challenge they present over barcodes in practice, however, is their lack of stability, a property that is crucial to validate their use as a faithful representation of the data and therefore a viable summary statistic. In this paper, we fill this gap by deriving two stability results for persistent homology rank functions under a suitable metric for FDA integration. We then study the performance of rank functions in functional inferential statistics and machine learning on real data applications, in both single and multiparameter persistent homology. We find that the use of persistent homology captured by rank functions offers a clear improvement over existing non-persistence-based approaches.

公開日:2024-09-22
翻訳日:2024-11-09 14:51:04
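
The equivalence between a barcode and its rank function mentioned in the abstract above can be illustrated in the single-parameter case: the rank at a pair (a, b) with a <= b counts the bars that are alive throughout [a, b]. The sketch below assumes half-open bars [birth, death), which is a common convention but an assumption here, and the function name is hypothetical.

```python
def rank_function(barcode, a, b):
    """Number of bars [birth, death) that contain both a and b, for a <= b."""
    if a > b:
        raise ValueError("rank function is defined for a <= b")
    return sum(1 for birth, death in barcode if birth <= a and b < death)


# Three bars: two finite and one that never dies.
barcode = [(0.0, 2.0), (0.5, 1.0), (1.5, float("inf"))]
r_early = rank_function(barcode, 0.5, 0.9)
r_late = rank_function(barcode, 1.5, 10.0)
```

Evaluating this function over a grid of (a, b) pairs gives an ordinary real-valued function, which is the kind of object functional data analysis can work with directly.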

# 双対クープマン回路による多体カオスの可解模型

Solvable models of many-body chaos from dual-Koopman circuits ( http://arxiv.org/abs/2307.04950v2 )

ライセンス: Link先を確認

Arul Lakshminarayan,

(参考訳) 双対ユニタリ回路は、相関関数や状態の時間発展を厳密に解ける多体量子カオスのモデルとして活発に研究されている。ここでは、それらの古典的対応物を双対正準変換と、それに付随する双対クープマン作用素として定義する。量子の場合と同様に、相関は光円錐上を除いて至る所で消え、光円錐上では単純な縮小写像によって決まる速度で減衰する。そのような双対正準変換の大きなクラスを与えたうえで、結合標準写像の例を詳細に研究し、可積分な場合から任意に離れたところで、熱力学的極限において系が混合的であることを解析的に示す。また、光円錐上を含む至る所で相関が消滅する「完全」クープマン作用素を定義し、エルゴード階層の頂点にあるベルヌーイ系とみなせる猫写像格子の例を示す。

Dual-unitary circuits are being vigorously studied as models of many-body quantum chaos that can be solved exactly for correlation functions and time evolution of states. Here we define their classical counterparts as dual-canonical transformations and associated dual-Koopman operators. Like their quantum counterparts, the correlations vanish everywhere except on the light cone, on which they decay with rates governed by a simple contractive map. Providing a large class of such dual-canonical transformations, we study in detail the example of a coupled standard map and show analytically that arbitrarily away from the integrable case, in the thermodynamic limit the system is mixing. We also define "perfect" Koopman operators that lead to the correlation vanishing everywhere including on the light cone and provide an example of a cat-map lattice which would qualify to be a Bernoulli system at the apex of the ergodic hierarchy.

公開日:2024-09-24
翻訳日:2024-11-09 14:51:04

# 資源制約を考慮した分散パラメータ推定における協調について

On Collaboration in Distributed Parameter Estimation with Resource Constraints ( http://arxiv.org/abs/2307.06442v2 )

ライセンス: Link先を確認

Yu-Zhen Janice Chen, Daniel S. Menasché, Don Towsley,

(参考訳) センサネットワーク、IoTシステム、分散コンピューティングにおける効果的なリソース割り当ては、環境監視、監視、スマートインフラストラクチャといったアプリケーションに不可欠である。センサやエージェントはパラメータ推定の精度を最大化するためにリソース割り当てを最適化する必要がある。本研究では,多変量ガウス分布の異なる変数からそれぞれサンプリングし,異なる推定対象を持つセンサ群やエージェント群について考察する。センサやエージェントのデータ収集や協調政策の設計問題をフィッシャー情報最大化(あるいはクレーマー・ラオ境界最小化)問題として定式化する。この定式化は、局所的な単変量サンプルの収集と多変量サンプルの生成の協調の間で、エネルギー利用の新たなトレードオフを捉えている。変数間の相関関係の知識が得られれば,(1)最適なデータ収集ポリシーが協調サンプリングのための情報伝達に資源を投入する,(2)サンプル間の相関関係の知識が推定効率を高めることができない,という2つの事例を解析的に同定する。相関関係の知識は利用できないが, 協調が有益である場合, 逐次分散パラメータ推定問題において, 最適なデータ収集と協調ポリシーを学習するために, マルチアームバンディットアルゴリズムを適用した新しいアプローチを提案する。本稿では,提案アルゴリズムであるDOUBLE-F, DOUBLE-Z, UCB-F, UCB-Zの有効性について述べる。

Effective resource allocation in sensor networks, IoT systems, and distributed computing is essential for applications such as environmental monitoring, surveillance, and smart infrastructure. Sensors or agents must optimize their resource allocation to maximize the accuracy of parameter estimation. In this work, we consider a group of sensors or agents, each sampling from a different variable of a multivariate Gaussian distribution and having a different estimation objective. We formulate a sensor or agent's data collection and collaboration policy design problem as a Fisher information maximization (or Cramer-Rao bound minimization) problem. This formulation captures a novel trade-off in energy use, between locally collecting univariate samples and collaborating to produce multivariate samples. When knowledge of the correlation between variables is available, we analytically identify two cases: (1) where the optimal data collection policy entails investing resources to transfer information for collaborative sampling, and (2) where knowledge of the correlation between samples cannot enhance estimation efficiency. When knowledge of certain correlations is unavailable, but collaboration remains potentially beneficial, we propose novel approaches that apply multi-armed bandit algorithms to learn the optimal data collection and collaboration policy in our sequential distributed parameter estimation problem. We illustrate the effectiveness of the proposed algorithms, DOUBLE-F, DOUBLE-Z, UCB-F, UCB-Z, through simulation.

公開日:2024-09-24
翻訳日:2024-11-09 14:51:04
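
The abstract above applies multi-armed bandit algorithms to learn a data collection and collaboration policy. The paper's own algorithms (DOUBLE-F, DOUBLE-Z, UCB-F, UCB-Z) are not described here, so the sketch below shows only the textbook UCB1 rule they may build on. Treating each arm as a candidate policy and the reward as a proxy for estimation quality in [0, 1] is an assumption made for illustration.

```python
import math


def ucb1(reward_fns, horizon):
    """Run plain UCB1 for `horizon` rounds and return the pull count per arm.

    `reward_fns[a]()` returns a reward in [0, 1] for arm `a`. In this sketch
    an arm stands for one hypothetical collection/collaboration policy.
    """
    n_arms = len(reward_fns)
    counts = [0] * n_arms
    totals = [0.0] * n_arms
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1  # play every arm once before using the index
        else:
            # Empirical mean plus an exploration bonus that shrinks with pulls.
            arm = max(
                range(n_arms),
                key=lambda a: totals[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        totals[arm] += reward_fns[arm]()
        counts[arm] += 1
    return counts


# Two arms with fixed rewards; the better arm ends up pulled far more often.
pulls = ucb1([lambda: 0.2, lambda: 0.8], horizon=200)
```

The exploration bonus is what lets the learner keep occasionally sampling a policy whose benefit is still uncertain, which matches the situation in the abstract where some correlations are unknown.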

# 風場の大規模空間補間のための二変量DeepKriging

Bivariate DeepKriging for Large-scale Spatial Interpolation of Wind Fields ( http://arxiv.org/abs/2307.08038v2 )

ライセンス: Link先を確認

Pratik Nag, Ying Sun, Brian J Reich,

(参考訳) 高空間分解能の風データは、気候、海洋学、気象学研究における幅広い応用に不可欠である。 2次元の速度を持つ二変量風場の大規模な空間補間またはダウンスケーリングは、風データが高い空間変動性と不均一性を持つ非ガウス的なものになりやすいため、難しい課題である。空間統計学において、コクリギングは二変量空間場の予測に一般的に用いられる。しかし、コクリギング予測器はガウス過程を除いて最適ではない。さらに、コクリギングは大規模データセットでは計算コストが非常に高い。本稿では,二変量空間データの予測のために、空間的な動径基底関数によって構築された埋め込み層を備えた空間依存型ディープニューラルネットワーク(DNN)である二変量DeepKrigingと呼ぶ手法を提案する。さらに,ブートストラップとアンサンブルDNNに基づく、分布を仮定しない不確実性定量化手法を開発する。提案手法は,線形コリージョナリゼーションモデルや柔軟な二変量Matérn共分散などの一般的な共分散関数を用いた従来のコクリギング予測器よりも優れている。提案したDNNモデルの計算効率とスケーラビリティを実証し、従来の手法に比べて計算が平均20倍高速であることを示す。二変量DeepKriging法を中東地域の506,771地点の風データに適用した。提案手法の予測性能はコクリギング予測器よりも優れており,計算時間を劇的に短縮する。

High spatial resolution wind data are essential for a wide range of applications in climate, oceanographic and meteorological studies. Large-scale spatial interpolation or downscaling of bivariate wind fields having velocity in two dimensions is a challenging task because wind data tend to be non-Gaussian with high spatial variability and heterogeneity. In spatial statistics, cokriging is commonly used for predicting bivariate spatial fields. However, the cokriging predictor is not optimal except for Gaussian processes. Additionally, cokriging is computationally prohibitive for large datasets. In this paper, we propose a method, called bivariate DeepKriging, which is a spatially dependent deep neural network (DNN) with an embedding layer constructed by spatial radial basis functions for bivariate spatial data prediction. We then develop a distribution-free uncertainty quantification method based on bootstrap and ensemble DNN. Our proposed approach outperforms the traditional cokriging predictor with commonly used covariance functions, such as the linear model of co-regionalization and flexible bivariate Matérn covariance. We demonstrate the computational efficiency and scalability of the proposed DNN model, with computations that are, on average, 20 times faster than those of conventional techniques. We apply the bivariate DeepKriging method to the wind data over the Middle East region at 506,771 locations. The prediction performance of the proposed method is superior to the cokriging predictors and dramatically reduces computation time.

公開日:2024-09-26
翻訳日:2024-11-09 14:51:04
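
The abstract above describes an embedding layer built from spatial radial basis functions. A minimal sketch of such an embedding is given below: each location is turned into a feature vector of kernel values against a set of fixed centers, which a neural network could then take as input. The Gaussian kernel, the hand-picked centers, and the single bandwidth are assumptions for illustration, and the paper's actual basis functions may differ.

```python
import math


def rbf_embedding(location, centers, bandwidth):
    """Map a coordinate tuple to Gaussian RBF features, one per center."""
    features = []
    for center in centers:
        sq_dist = sum((s - c) ** 2 for s, c in zip(location, center))
        features.append(math.exp(-sq_dist / (2.0 * bandwidth ** 2)))
    return features


# Hypothetical 2 x 2 grid of centers on the unit square.
centers = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
features = rbf_embedding((0.0, 0.0), centers, bandwidth=1.0)
```

Nearby locations receive similar feature vectors, which is how an embedding of this kind can pass spatial dependence to a network that otherwise treats its inputs as unordered.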

# ポーラメカニクス:三重強結合領域における光子、マグノン、フォノン

Polaromechanics: photons, magnons and phonons in the triple strong-coupling regime ( http://arxiv.org/abs/2307.11328v3 )

ライセンス: Link先を確認

Rui-Chang Shen, Jie Li, Yi-Ming Sun, Wei-Jiang Wu, Xuan Zuo, Yi-Pu Wang, Shi-Yao Zhu, J. Q. You,

(参考訳) ハイブリッド量子システムの構築は、多機能量子技術、量子情報処理、ハイブリッド量子ネットワークを実現するための重要なステップである。機能的なハイブリッド量子系は、その構成要素間の強い結合を必要とする。しかし、異なる物理系間の結合は通常非常に弱い。ハイブリッド系における強結合の実験的実現は、特に複数の構成要素を持ち、それらの性質が異なる場合、長年の課題である。ここでは、強結合した強磁性マグノンとマイクロ波光子によって形成されるポラリトンがフォノンとさらに強く結合する、新しいポーラメカニカルハイブリッド系における三重強結合の実現を実証する。対応するポーラメカニカルな正規モード分裂が観測される。コヒーレント完全吸収を利用してポラリトンの減衰率を大幅に低減することにより、$9.4\times10^3$という高いポーラメカニカル協調性が達成される。系を低温の熱浴に置けば、1よりはるかに大きい量子協調性が達成可能であり、様々な量子応用が可能となる。この結果は、光子、マグノン、フォノンのコヒーレントな量子制御への道を開くものであり、マグノンをベースとした機能的なハイブリッド量子システムを構築するための重要なステップである。

Building hybrid quantum systems is a crucial step for realizing multifunctional quantum technologies, quantum information processing, and hybrid quantum networks. A functional hybrid quantum system requires strong coupling among its components. However, couplings between distinct physical systems are typically very weak. Experimental realization of strong coupling in a hybrid system remains a long-standing challenge, especially when it has multiple components and the components are of different nature. Here we demonstrate the realization of triple strong coupling in a novel polaromechanical hybrid system, where polaritons, formed by strongly coupled ferromagnetic magnons and microwave photons, are further strongly coupled to phonons. The corresponding polaromechanical normal-mode splitting is observed. A high polaromechanical cooperativity of $9.4\times10^3$ is achieved by significantly reducing the polariton decay rate via exploiting coherent perfect absorption. The quantum cooperativity much greater than unity is achievable if placing the system at low bath temperatures, which would enable various quantum applications. Our results pave the way towards coherent quantum control of photons, magnons and phonons, and are a crucial step for building functional hybrid quantum systems based on magnons.

公開日:2024-09-27
翻訳日:2024-11-09 14:51:04

# 複素数を持つ論理ゲートについて

On Logic Gates with Complex Numbers ( http://arxiv.org/abs/2307.12905v6 )

ライセンス: Link先を確認

M. W. AlMasri,

(参考訳) 論理ゲートは複素微分作用素の言葉で書くことができ、入力と出力は複数の変数を持つ正則函数である。複素数の極表現を用いて、系の振動挙動と論理ゲートの間の即時接続に到達する。様々な計算システムにおけるこの形式主義の普遍性について論じる。

Logic gates can be written in terms of complex differential operators, where the inputs and outputs are holomorphic functions with several variables. Using the polar representation of complex numbers, we arrive at an immediate connection between the oscillatory behavior of the system and logic gates. We discuss the universality of this formalism in a variety of computing systems.

公開日:2024-10-10
翻訳日:2024-11-09 14:51:04

# 差分プライベートな重み付き経験的リスク最小化手法とその結果重み付き学習への応用

A Differentially Private Weighted Empirical Risk Minimization Procedure and its Application to Outcome Weighted Learning ( http://arxiv.org/abs/2307.13127v2 )

ライセンス: Link先を確認

Spencer Giddens, Yiwang Zhou, Kevin R. Krull, Tara M. Brinkman, Peter X. K. Song, Fang Liu,

(参考訳) 個人情報を含むデータを用いて、経験的リスク最小化(ERM)の枠組みで予測モデルを構築するのが一般的である。これらのモデルは予測には非常に正確であるが、機密性の高いデータに基づいてトレーニングされたこれらのモデルの結果を共有することは、プライバシ攻撃の影響を受けやすい。差分プライバシー(DP)は、機密データから情報を公開する際に生じるプライバシー損失に数学的に証明可能な境界を提供することによって、そのようなデータプライバシー問題に対処するための魅力的なフレームワークである。これまでの作業は主に、未加重ERMにDPを適用することに集中してきた。重み付きERM (wERM) は, 目的関数に対する個々の貢献を様々な重みに割り当てることができる重要な一般化である。一般のwERMに対する最初の微分プライベートアルゴリズムを提案し、理論DPを保証する。既存のDP-ERMプロシージャをwERMに拡張することで、一般的な結果重み付き学習(OWL)を含む個別の処理ルールに対するプライバシー保護学習手法を導出する道が形成される。シミュレーションおよび実際の臨床試験において,OWLに適用したDP-wERMフレームワークの性能評価を行った。実験結果はすべて、十分な堅牢なモデル性能を維持しつつ、DP保証付きwERMによるOWLモデルのトレーニングが可能であることを示し、センシティブなデータを含む現実のシナリオにおいて、提案したプライバシ保存OWLプロシージャの実装の実用性を示す強力な証拠を提供する。

It is common practice to use data containing personal information to build predictive models in the framework of empirical risk minimization (ERM). While these models can be highly accurate in prediction, sharing the results from these models trained on sensitive data may be susceptible to privacy attacks. Differential privacy (DP) is an appealing framework for addressing such data privacy issues by providing mathematically provable bounds on the privacy loss incurred when releasing information from sensitive data. Previous work has primarily concentrated on applying DP to unweighted ERM. We consider weighted ERM (wERM), an important generalization, where each individual's contribution to the objective function can be assigned varying weights. We propose the first differentially private algorithm for general wERM, with theoretical DP guarantees. Extending the existing DP-ERM procedures to wERM creates a pathway for deriving privacy-preserving learning methods for individualized treatment rules, including the popular outcome weighted learning (OWL). We evaluate the performance of the DP-wERM framework applied to OWL in both simulation studies and in a real clinical trial. All empirical results demonstrate the feasibility of training OWL models via wERM with DP guarantees while maintaining sufficiently robust model performance, providing strong evidence for the practicality of implementing the proposed privacy-preserving OWL procedure in real-world scenarios involving sensitive data.

公開日:2024-09-27
翻訳日:2024-11-09 14:51:04

# 局所アドレス性に制限のある中性原子デバイスにおける回路分解とスケジューリング

Circuit decompositions and scheduling for neutral atom devices with limited local addressability ( http://arxiv.org/abs/2307.14996v2 )

ライセンス: Link先を確認

Natalia Nottingham, Michael A. Perlin, Dhirpal Shah, Ryan White, Hannes Bernien, Frederic T. Chong, Jonathan M. Baker,

(参考訳) 中性原子ハードウェア技術の進歩は続いているが、中性原子量子コンピュータの課題を克服するために設計されたシステムレベルのソフトウェアでは、まだ開発が限られている。特に、現在の中性原子アーキテクチャのほとんどは、ブロッホ球のxy平面の軸付近の1量子ビット回転の局所的なアドレッシングをネイティブにサポートしていない。代わりに、これらは全てのキュービットに同時に適用されるグローバルビームを介して実行される。従来の中性原子実験では、操作の短いシーケンスをこのネイティブゲートセットに変換する単純な合成法を使用していたが、これらの方法はシステムレベルのフレームワークに組み込むことも、非現実的なシリアライゼーションの量を課すことなく、回路全体に適用することもできない。十分なコンパイラ最適化がなければ、グローバルゲートを含む分解は回路深さ、ゲート数、エラーの蓄積を大幅に増加させる。この問題に対処する以前のコンパイラ作業はなく、この問題を解決するために既存のコンパイラを適用するのは簡単ではない。本稿では,任意のゲートセットからグローバルゲートを含むリアルな中性原子ネイティブゲートセットに入力回路を変換する最適化コンパイラパイプラインを提案する。最終回路のグローバルゲート数と全グローバルローテーション量を最小限に抑える分解とスケジューリングに焦点をあてる。示すように、これらのコストは、他のゲートタイプによるコストと比較して、回路の持続時間と全体的な誤差に最も寄与する。コンパイラパイプラインの最適化されていないバージョンと比較して、グローバルゲートコストの最小化は、回路長の最大4.77倍のスピードアップをもたらす。従来の作業と比べ、最大53.8倍のスピードアップを実現しています。大型回路では,回路の忠実度が若干向上している。

Despite major ongoing advancements in neutral atom hardware technology, there remains limited work in systems-level software tailored to overcoming the challenges of neutral atom quantum computers. In particular, most current neutral atom architectures do not natively support local addressing of single-qubit rotations about an axis in the xy-plane of the Bloch sphere. Instead, these are executed via global beams applied simultaneously to all qubits. While previous neutral atom experimental work has used straightforward synthesis methods to convert short sequences of operations into this native gate set, these methods cannot be incorporated into a systems-level framework nor applied to entire circuits without imposing impractical amounts of serialization. Without sufficient compiler optimizations, decompositions involving global gates will significantly increase circuit depth, gate count, and accumulation of errors. No prior compiler work has addressed this, and adapting existing compilers to solve this problem is nontrivial. In this paper, we present an optimized compiler pipeline that translates an input circuit from an arbitrary gate set into a realistic neutral atom native gate set containing global gates. We focus on decomposition and scheduling passes that minimize the final circuit's global gate count and total global rotation amount. As we show, these costs contribute the most to the circuit's duration and overall error, relative to costs incurred by other gate types. Compared to the unoptimized version of our compiler pipeline, minimizing global gate costs gives up to 4.77x speedup in circuit duration. Compared to the closest prior existing work, we achieve up to 53.8x speedup. For large circuits, we observe a few orders of magnitude improvement in circuit fidelities.

公開日:2024-09-23
翻訳日:2024-11-09 14:51:04

# RoboDepth Challenge:ロバスト深さ推定に向けた手法と進歩

The RoboDepth Challenge: Methods and Advancements Towards Robust Depth Estimation ( http://arxiv.org/abs/2307.15061v2 )

ライセンス: Link先を確認

Lingdong Kong, Yaru Niu, Shaoyuan Xie, Hanjiang Hu, Lai Xing Ng, Benoit R. Cottereau, Liangjun Zhang, Hesheng Wang, Wei Tsang Ooi, Ruijie Zhu, Ziyang Song, Li Liu, Tianzhu Zhang, Jun Yu, Mohan Jing, Pengwei Li, Xiaohua Qi, Cheng Jin, Yingfeng Chen, Jie Hou, Jie Zhang, Zhen Kan, Qiang Ling, Liang Peng, Minglei Li, Di Xu, Changpeng Yang, Yuanqi Yao, Gang Wu, Jian Kuai, Xianming Liu, Junjun Jiang, Jiamian Huang, Baojun Li, Jiale Chen, Shuang Zhang, Sun Ao, Zhenyu Li, Runze Chen, Haiyong Luo, Fang Zhao, Jingze Yu,

(参考訳) 悪天候, センサ故障, 騒音汚染など, アウト・オブ・ディストリビューション(OoD)のシナリオ下での正確な深度推定は, 安全クリティカルな応用に望ましい。しかし、既存の深度推定システムは、必然的に現実世界の腐敗や摂動に悩まされ、そのような場合の信頼性の高い深度予測に苦慮している。本稿では,頑健なOoD深度推定を容易にすることを目的とした学術コンペであるRoboDepth Challengeの優勝ソリューションを要約する。この問題は、新たに確立されたKITTI-CとNYUDepth2-Cベンチマークに基づいて開発された。我々は2つのスタンドアローントラックをホストし、それぞれ、頑健な自己監督と頑健な完全教師付き深度推定に重点を置いていた。 200人を超える参加者のうち、9つの独特で最高のソリューションが登場し、空間領域と周波数領域の強化、マスク付き画像モデリング、画像復元と超高解像度化、対向訓練、拡散に基づくノイズ抑圧、視覚言語による事前学習、学習モデルエンハンスブル、階層的特徴強化など、新しい設計がなされている。各設計の背景にある理論的根拠をよりよく理解するために、総合的な実験分析と洞察に富んだ観察を描いている。この課題が、堅牢で信頼性の高い深度推定などに関する将来の研究の確固たる基盤となることを願っている。データセット、競争ツールキット、ワークショップ記録、優勝チームのソースコードは、チャレンジウェブサイトで公開されている。

Accurate depth estimation under out-of-distribution (OoD) scenarios, such as adverse weather conditions, sensor failure, and noise contamination, is desirable for safety-critical applications. Existing depth estimation systems, however, suffer inevitably from real-world corruptions and perturbations and struggle to provide reliable depth predictions in such cases. In this paper, we summarize the winning solutions from the RoboDepth Challenge -- an academic competition designed to facilitate and advance robust OoD depth estimation. This challenge was developed based on the newly established KITTI-C and NYUDepth2-C benchmarks. We hosted two stand-alone tracks, with an emphasis on robust self-supervised and robust fully-supervised depth estimation, respectively. Out of more than two hundred participants, nine unique and top-performing solutions have emerged, with novel designs spanning the following aspects: spatial- and frequency-domain augmentations, masked image modeling, image restoration and super-resolution, adversarial training, diffusion-based noise suppression, vision-language pre-training, learned model ensembling, and hierarchical feature enhancement. Extensive experimental analyses and insightful observations are provided to better understand the rationale behind each design. We hope this challenge could lay a solid foundation for future research on robust and reliable depth estimation and beyond. The datasets, competition toolkit, workshop recordings, and source code from the winning teams are publicly available on the challenge website.

公開日:2024-09-24
翻訳日:2024-11-09 14:40:04

# 半無限導波路と結合した原子に基づく量子コヒーレント及び測定フィードバック制御

Quantum coherent and measurement feedback control based on atoms coupled with a semi-infinite waveguide ( http://arxiv.org/abs/2307.16876v4 )

ライセンス: Link先を確認

Haijin Ding, Nina H. Amini, Guofeng Zhang, John E. Gough,

(参考訳) 本稿では,複数の2レベル原子を結合した半無限導波路に基づく原子・フォトニック系の所望の状態を生成するために,量子フィードバック制御が適用可能であることを示す。このセットアップでは、初期励起原子が導波路に1つの光子を放出し、終端ミラーや他の原子によって反射され、原子と光子のコヒーレント相互作用を介して異なるフィードバックループを確立することができる。導波路量子電磁力学(導波路QED)系に高々2つの励起が存在する場合、量子状態の進化はランダムグラフ理論を用いて解釈できる。このプロセスは環境の影響を受けるが,計測に基づくフィードバック制御やコヒーレントドライブによって環境誘起のダイナミクスを排除できることを明らかにする。したがって、オープン系原子-導波路相互作用において、測定に基づくフィードバックは最終的な定常量子状態を変調することができ、同時に、測定プロセスにおけるホモダイン検出ノイズは振動を誘発し、コヒーレントなフィードバック設計によって処理される。

In this paper, we show that quantum feedback control may be applied to generate desired states for atomic and photonic systems based on a semi-infinite waveguide coupled with multiple two-level atoms. In this set-up, an initially excited atom can emit one photon into the waveguide, which can be reflected by the terminal mirror or other atoms to establish different feedback loops via the coherent interactions between the atom and photon. When there are at most two excitations in the waveguide quantum electrodynamics (waveguide QED) system, the evolution of quantum states can be interpreted using random graph theory. While this process is influenced by the environment, we clarify that the environment-induced dynamics can be eliminated by measurement-based feedback control or coherent drives. Thus, in open-system atom-waveguide interactions, measurement-based feedback can modulate the final steady quantum state, while simultaneously, the homodyne detection noise in the measurement process can induce oscillations, which are treated by the coherent feedback designs.

公開日:2024-09-24
翻訳日:2024-11-09 14:40:04

# 複数の固有値の位相推定のためのチャネルベースフレームワーク

Channel-based framework for phase estimation of multiple eigenvalues ( http://arxiv.org/abs/2308.02307v2 )

ライセンス: Link先を確認

Yuan-De Jin, Shi-Yu Zhang, Wen-Long Ma,

(参考訳) ターゲット量子系上のユニタリ演算子の固有値の量子位相推定(QPE)は、様々な量子アルゴリズムにおいて重要なサブルーチンである。従来のQPEは、多くのアンシラ量子ビットと量子フーリエ変換を実行する能力を必要とするため、実装に費用がかかることが多い。反復QPEの最近の進歩は、単一アンシラと古典的な後処理を繰り返し使用することにより、実装コストを削減している。しかし、従来型と反復型の両方のスキームでは、ユニタリ演算子の固有状態におけるターゲットシステムの準備が要求されるが、初期状態の準備を必要とせずに複数の固有値のQPEを達成することはあいまいである。ここでは、反復QPEのための逐次量子チャネルに基づく理論的枠組みを開発することにより、この問題を明らかにする。複数固有値のQPEを任意の初期目標系状態に対して効率よく実現し, 目標系における反復QPEの測定バックアクションを長いコヒーレンス時間で有効に活用できることを見出した。具体的には、アンシラ量子ビットの逐次ラムゼー干渉計測(RIM)に基づく2つの反復QPEスキームについて検討する。 (a) 固有値を推定する際の標準量子極限を達成するために反復RIMを実行する反復スキーム b) ハイゼンベルク限界に達するための事前測定結果に基づいて各RIMのパラメータを調整する適応型スキーム。どちらのスキームにおいても、連続的なアンシラ測定はターゲットシステム上で逐次的な量子チャネルを生成し、それを推定されたユニタリ演算子の固有状態に徐々にステアリングする一方、アンシラの測定統計は適切な後処理でその固有値に関する埋め込み情報を明らかにすることができる。本研究では, 中心スピンモデルを用いて解析を行い, 両スキームの性能と耐雑音性を評価する。

Quantum phase estimation (QPE) of the eigenvalues of a unitary operator on a target quantum system is a crucial subroutine in various quantum algorithms. Conventional QPE is often expensive to implement as it requires a large number of ancilla qubits and the ability to perform a quantum Fourier transform. Recent developments in iterative QPE reduce the implementation cost through repeated use of a single ancilla and classical post-processing. However, both conventional and iterative schemes often require preparation of the target system in an eigenstate of the unitary operator, while it remains unclear how to achieve QPE of multiple eigenvalues without initial state preparation. Here we clarify this issue by developing a theoretical framework based on sequential quantum channels for iterative QPE. We find that QPE of multiple eigenvalues can be efficiently realized for an arbitrary initial target system state by actively utilizing the measurement backaction of iterative QPE on the target system with a long coherence time. Specifically, we investigate two iterative QPE schemes based on sequential Ramsey interferometry measurements (RIMs) of an ancilla qubit: (a) the repetitive scheme, which conducts repetitive RIMs to achieve the standard quantum limit in estimating the eigenvalues; (b) the adaptive scheme, which adjusts the parameters of each RIM based on prior measurement outcomes to attain the Heisenberg limit. In both schemes, sequential ancilla measurements generate sequential quantum channels on the target system, gradually steering it to the eigenstates of the estimated unitary operator, while the measurement statistics of the ancilla can reveal the embedded information about its eigenvalues with proper post-processing. We demonstrate the analysis by simulating a central spin model, and evaluate the performance and noise resilience of both schemes.

公開日:2024-09-27
翻訳日:2024-11-09 14:40:04

# TempFuser: 長短期時間融合トランスフォーマーを用いた、機敏で戦術的かつアクロバティックな飛行機動の学習

TempFuser: Learning Agile, Tactical, and Acrobatic Flight Maneuvers Using a Long Short-Term Temporal Fusion Transformer ( http://arxiv.org/abs/2308.03257v4 )

ライセンス: Link先を確認

Hyunki Seong, David Hyunchul Shim,

(参考訳) ドッグファイティングは、戦略的操作とアジャイル航空機の空気力学の両方を包括的に理解する必要がある航空アプリケーションにおいて難しいシナリオである。航空エージェントは、長期的視点から戦闘機の戦術的に進化する操縦を理解するだけでなく、短期的な視点から急速に変化する航空機の空気力学に反応することも必要である。本稿では, 複雑なドッグファイト問題におけるアジャイル, 戦術的, アクロバティックな飛行操作を学習できる, 新しい長短期時間融合トランスフォーマーアーキテクチャである TempFuser を紹介する。当社のアプローチでは、2つの異なる時間的遷移の埋め込みをトランスフォーマーベースのネットワークに統合し、航空エージェントの長期的戦術と短期的機敏性の両方を包括的に捉える。これらの視点を取り入れることで、当社のポリシネットワークは、長期にわたって支配的な位置を確保し、機敏な相手を効果的に上回る、エンドツーエンドのフライトコマンドを生成します。高忠実度飛行シミュレーターで訓練した後、我々のモデルは戦略的な操作をうまく学習し、様々な種類の敵機に対してベースラインのポリシーモデルより優れた性能を発揮する。特に,本モデルは,先行知識に頼ることなく,優れた仕様の敵に直面しても,人間のようなアクロバティックな操作を示す。さらに,超音速・低高度の困難な状況において,頑健な追尾性能を示す。デモビデオは https://sites.google.com/view/tempfuser で公開されている。

Dogfighting is a challenging scenario in aerial applications that requires a comprehensive understanding of both strategic maneuvers and the aerodynamics of agile aircraft. The aerial agent needs to not only understand tactically evolving maneuvers of fighter jets from a long-term perspective but also react to rapidly changing aerodynamics of aircraft from a short-term viewpoint. In this paper, we introduce TempFuser, a novel long short-term temporal fusion transformer architecture that can learn agile, tactical, and acrobatic flight maneuvers in complex dogfight problems. Our approach integrates two distinct temporal transition embeddings into a transformer-based network to comprehensively capture both the long-term tactics and short-term agility of aerial agents. By incorporating these perspectives, our policy network generates end-to-end flight commands that secure dominant positions over the long term and effectively outmaneuver agile opponents. After training in a high-fidelity flight simulator, our model successfully learns to execute strategic maneuvers, outperforming baseline policy models against various types of opponent aircraft. Notably, our model exhibits human-like acrobatic maneuvers even when facing adversaries with superior specifications, all without relying on prior knowledge. Moreover, it demonstrates robust pursuit performance in challenging supersonic and low-altitude situations. Demo videos are available at https://sites.google.com/view/tempfuser.

公開日:2024-09-25
翻訳日:2024-11-09 14:40:04

# 量子コンピュータのためのファジィゲージ理論

Fuzzy gauge theory for quantum computers ( http://arxiv.org/abs/2308.05253v4 )

ライセンス: Link先を確認

Andrei Alexandru, Paulo F. Bedaque, Andrea Carosso, Michael J. Cervia, Edison M. Murairi, Andy Sheng,

(参考訳) 連続ゲージ理論は、そのボゾン次数により、無限次元局所ヒルベルト空間を持つ。量子ビットベースのハードウェア上でこれらの自由度を符号化するには、有限個の自由度しか使わずに理論の振舞いを近似するある種の「量子化」スキームが必要である。ファジィゲージ理論 (fuzzy gauge theory) と呼ばれるゲージ理論に対する新しい量子化戦略を提案し、ファジィ$\sigma$-モデルの成功に基づく。ファジィゲージ理論は正規ゲージ理論と同じ普遍性クラスに属し、その場合、通常の空間連続極限以外のいかなる極限も必要としない。さらに,これらのモデルが量子シミュレーションにおいて比較的資源効率が高いことを示す。

Continuous gauge theories, because of their bosonic degrees of freedom, have an infinite-dimensional local Hilbert space. Encoding these degrees of freedom on qubit-based hardware demands some sort of ``qubitization'' scheme, where one approximates the behavior of a theory while using only finitely many degrees of freedom. We propose a novel qubitization strategy for gauge theories, called ``fuzzy gauge theory,'' building on the success of the fuzzy $\sigma$-model in earlier work. We provide arguments that the fuzzy gauge theory lies in the same universality class as regular gauge theory, in which case its use would obviate the need of any further limit besides the usual spatial continuum limit. Furthermore, we demonstrate that these models are relatively resource-efficient for quantum simulations.

公開日:2024-09-24
翻訳日:2024-11-09 14:40:04

# CyberForce: マルウェア除去のためのフェデレーション強化学習フレームワーク

CyberForce: A Federated Reinforcement Learning Framework for Malware Mitigation ( http://arxiv.org/abs/2308.05978v3 )

ライセンス: Link先を確認

Chao Feng, Alberto Huertas Celdran, Pedro Miguel Sanchez Sanchez, Jan Kreischer, Jan von der Assen, Gerome Bovet, Gregorio Martinez Perez, Burkhard Stiller,

(参考訳) 近年の研究では、強化学習(RL)と移動目標防衛(MTD)の統合により、IoT(Internet-of-Things)デバイスにおけるサイバーセキュリティが向上することが示されている。それでも、既存の作業の実践性は、RLにおける集中型データ処理に関連するデータプライバシの懸念や、不均一なゼロデイ攻撃の増加に対して有効な適切なMTD技術を学ぶのに必要な不満足な時間によって妨げられている。この研究は、フェデレーションと強化学習(FRL)を組み合わせたフレームワークであるCyberForceを紹介し、ゼロデイ攻撃を緩和するための適切なMTDテクニックを共同でプライベートに学習する。 CyberForceはデバイスフィンガープリントと異常検出を統合して、FRLベースのエージェントによって選択されたMTDメカニズムを報酬または罰する。このフレームワークは、異種マルウェアのサンプルに影響された実際のIoTプラットフォームの10の物理デバイスで構成されたシナリオでデプロイされ、評価されている。実験のプールは、CyberForceが既存のRLベースの集中型アプローチよりも高速に攻撃を緩和するMTD技術を学ぶことを示した。さらに、様々なデバイスが異なる攻撃にさらされると、CyberForceは知識伝達の恩恵を受け、性能が向上し、最近の研究と比べて学習時間が短縮される。最後に、エージェント学習プロセスで使用される異なる集約アルゴリズムは、CyberForceに悪意のある攻撃に対する顕著な堅牢性を提供する。

Recent research has shown that the integration of Reinforcement Learning (RL) with Moving Target Defense (MTD) can enhance cybersecurity in Internet-of-Things (IoT) devices. Nevertheless, the practicality of existing work is hindered by data privacy concerns associated with centralized data processing in RL, and the unsatisfactory time needed to learn the right MTD techniques that are effective against a rising number of heterogeneous zero-day attacks. Thus, this work presents CyberForce, a framework that combines Federated and Reinforcement Learning (FRL) to collaboratively and privately learn suitable MTD techniques for mitigating zero-day attacks. CyberForce integrates device fingerprinting and anomaly detection to reward or penalize MTD mechanisms chosen by an FRL-based agent. The framework has been deployed and evaluated in a scenario consisting of ten physical devices of a real IoT platform affected by heterogeneous malware samples. A pool of experiments has demonstrated that CyberForce learns the MTD technique mitigating each attack faster than existing RL-based centralized approaches. In addition, when various devices are exposed to different attacks, CyberForce benefits from knowledge transfer, leading to enhanced performance and reduced learning time in comparison to recent works. Finally, different aggregation algorithms used during the agent learning process provide CyberForce with notable robustness to malicious attacks.

公開日:2024-09-30
翻訳日:2024-11-09 14:40:04

# BehaVR:VRセンサデータに基づくユーザ識別

BehaVR: User Identification Based on VR Sensor Data ( http://arxiv.org/abs/2308.07304v2 )

ライセンス: Link先を確認

Ismat Jarin, Yu Duan, Rahmadi Trimananda, Hao Cui, Salma Elmalaki, Athina Markopoulou,

(参考訳) 仮想現実(VR)プラットフォームは幅広いアプリケーションを可能にするが、独自のプライバシーリスクを生じさせる。特にVRデバイスには、個人的かつ機密性の高い情報(例えば、身体の動き、視線、手関節、表情など)を収集する、豊富なセンサーが備わっている。これらの新しいセンサーのデータは、明示的な識別子がなくても、ユーザーをユニークに識別するために使用することができる。本稿では,VRセンサデータのみに基づいて,さまざまなジャンルの現実世界のアプリ内およびアプリ間で,ユーザがどの程度識別されうるかを理解することを目的とする。単一のアプリ内で利用可能なAPIの観察(アプリ攻撃者)から、VRデバイス上の複数のアプリにまたがるすべてまたは選択されたセンサ計測の観察(デバイス攻撃者)まで、さまざまな能力を持つ攻撃者について検討する。そのために、BehaVRを導入する。BehaVRは、VRデバイス上で実行される複数のアプリによって収集されたすべてのセンサグループからのデータを収集し、分析するフレームワークである。私たちはBehaVRを使って、20の人気のある現実世界のアプリと対話する実際のユーザーからデータを収集する。そのデータを使って、利用可能なセンサデータから抽出した特徴量を用い、アプリ内およびアプリ間のユーザ識別のための機械学習モデルを構築する。これらのモデルがユーザを最大100%の精度で識別できることを示し、アプリの機能や攻撃者に応じて、最も重要な特徴量やセンサグループを明らかにする。私たちの知る限りでは、BehaVRはVRにおけるユーザー識別を包括的に分析する最初の研究である。すなわち、特注のアプリではなく複数の現実世界のアプリによって収集された、消費者向けVRデバイスで利用可能なすべてのセンサ計測を考慮している。

Virtual reality (VR) platforms enable a wide range of applications; however, they pose unique privacy risks. In particular, VR devices are equipped with a rich set of sensors that collect personal and sensitive information (e.g., body motion, eye gaze, hand joints, and facial expression). The data from these newly available sensors can be used to uniquely identify a user, even in the absence of explicit identifiers. In this paper, we seek to understand the extent to which a user can be identified based solely on VR sensor data, within and across real-world apps from diverse genres. We consider adversaries with capabilities that range from observing APIs available within a single app (app adversary) to observing all or selected sensor measurements across multiple apps on the VR device (device adversary). To that end, we introduce BehaVR, a framework for collecting and analyzing data from all sensor groups collected by multiple apps running on a VR device. We use BehaVR to collect data from real users that interact with 20 popular real-world apps. We use that data to build machine learning models for user identification within and across apps, with features extracted from available sensor data. We show that these models can identify users with an accuracy of up to 100%, and we reveal the most important features and sensor groups, depending on the functionality of the app and the adversary. To the best of our knowledge, BehaVR is the first to analyze user identification in VR comprehensively, i.e., considering all sensor measurements available on consumer VR devices, collected by multiple real-world, as opposed to custom-made, apps.

公開日:2024-09-23
翻訳日:2024-11-09 14:40:04

# コードLLMのための高リソースから低リソースプログラミング言語への知識伝達

Knowledge Transfer from High-Resource to Low-Resource Programming Languages for Code LLMs ( http://arxiv.org/abs/2308.09895v6 )

ライセンス: Link先を確認

Federico Cassano, John Gouwar, Francesca Lucchetti, Claire Schlesinger, Anders Freeman, Carolyn Jane Anderson, Molly Q Feldman, Michael Greenberg, Abhinav Jangda, Arjun Guha,

(参考訳) ここ数年、Large Language Models of Code (Code LLMs) はプログラミングの実践に大きな影響を与え始めています。プログラミング言語やソフトウェア工学の研究のためのビルディングブロックとして、コードLLMが登場している。しかし、Code LLMはトレーニングデータ(例えば、Java、Python、JavaScript)でよく表現されているが、トレーニングデータに制限のある低リソースの言語では苦労しているプログラミング言語に対して印象的な結果をもたらす。低リソース言語にはOCaml、Racket、その他いくつかのものがある。本稿では,半合成データを用いた低リソース言語上でのコードLLMの性能向上に有効な手法を提案する。我々のアプローチであるMultiPL-Tは、ハイソース言語からのトレーニングデータを、以下の方法で低リソース言語のトレーニングデータに変換する。 1) Code LLMを使用して、高ソース言語からのコメント付きコードのテストの合成を行い、欠陥のあるテストとテストカバレッジの低いコードをフィルタリングします。 2) コードLLMを使用してPythonコードをターゲットとする低リソース言語に翻訳し,テストを使用して翻訳を検証する。このアプローチを適用して,Julia,Lua,OCaml,R,Racketの各トレーニング項目を数万個生成する。さらに、オープンなトレーニングデータ(The Stack)を備えたオープンモデル(StarCoderBase)を使用することで、ベンチマークの削除や、ライセンスに違反することなくモデルをトレーニングし、それ以外の方法では不可能な実験を実行することが可能になります。 MultiPL-T 生成データを用いて,Julia,Lua,OCaml,R,Racket 用の StarCoderBase と Code Llama の微調整版を提示する。確立されたベンチマーク(MultiPL-E)では、これらのモデルは他のオープンコードLLMよりも優れている。 MultiPL-Tアプローチは、新しい言語に簡単に適用でき、トレーニングのような代替手段よりもはるかに効率的で効果的である。

Over the past few years, Large Language Models of Code (Code LLMs) have started to have a significant impact on programming practice. Code LLMs are also emerging as building blocks for research in programming languages and software engineering. However, Code LLMs produce impressive results on programming languages that are well represented in their training data (e.g., Java, Python, or JavaScript), but struggle with low-resource languages that have limited training data available. Low resource languages include OCaml, Racket, and several others. This paper presents an effective approach for boosting the performance of Code LLMs on low-resource languages using semi-synthetic data. Our approach, MultiPL-T, translates training data from high-resource languages into training data for low-resource languages in the following way. 1) We use a Code LLM to synthesize tests for commented code from a high-resource language, filtering out faulty tests and code with low test coverage. 2) We use a Code LLM to translate Python code to a target low-resource language, and use tests to validate the translation. We apply this approach to generate tens of thousands of validated training items for Julia, Lua, OCaml, R, and Racket. Furthermore, we use an open model (StarCoderBase) with open training data (The Stack), which allows us to decontaminate benchmarks, train models without violating licenses, and run experiments that could not otherwise be done. With MultiPL-T generated data, we present fine-tuned versions of StarCoderBase and Code Llama for Julia, Lua, OCaml, R, and Racket. On established benchmarks (MultiPL-E), these models outperform other open Code LLMs. The MultiPL-T approach is easy to apply to new languages, and is significantly more efficient and effective than alternatives such as training longer.

公開日:2024-09-22
翻訳日:2024-11-09 14:40:04

# ChatEDA:EDAのための大規模言語モデル駆動自律エージェント

ChatEDA: A Large Language Model Powered Autonomous Agent for EDA ( http://arxiv.org/abs/2308.10204v4 )

ライセンス: Link先を確認

Zhuolun He, Haoyuan Wu, Xinyun Zhang, Xufeng Yao, Su Zheng, Haisheng Zheng, Bei Yu,

(参考訳) 相互運用性を高めるための複雑な電子設計自動化(EDA)ツールの統合は、回路設計者にとって重要な関心事である。大規模言語モデル(LLM)の最近の進歩は、自然言語処理と理解において優れた能力を示しており、EDAツールとのインターフェースに新しいアプローチを提供する。本稿では,LLMであるAutoMageによって強化され、実行役としてのEDAツールで補完されたEDAの自律エージェントChatEDAを紹介する。 ChatEDAは、タスク分解、スクリプト生成、タスク実行を効果的に管理することで、レジスタ転送レベル(RTL)からGraphic Data System Version II(GDSII)への設計フローを合理化する。総合的な実験評価を通じて,ChatEDAは多様な要求に対処する能力を示し,我々の微調整したAutoMageモデルはGPT-4や他の類似のLLMと比較して優れた性能を示した。

The integration of a complex set of Electronic Design Automation (EDA) tools to enhance interoperability is a critical concern for circuit designers. Recent advancements in large language models (LLMs) have showcased their exceptional capabilities in natural language processing and comprehension, offering a novel approach to interfacing with EDA tools. This research paper introduces ChatEDA, an autonomous agent for EDA empowered by an LLM, AutoMage, complemented by EDA tools serving as executors. ChatEDA streamlines the design flow from the Register-Transfer Level (RTL) to the Graphic Data System Version II (GDSII) by effectively managing task decomposition, script generation, and task execution. Through comprehensive experimental evaluations, ChatEDA has demonstrated its proficiency in handling diverse requirements, and our fine-tuned AutoMage model has exhibited superior performance compared to GPT-4 and other similar LLMs.

公開日:2024-09-21
翻訳日:2024-11-09 14:40:04

# 局所最小値を飛び越える: Vision Transformerの損失ランドスケープにおける量子化

Jumping through Local Minima: Quantization in the Loss Landscape of Vision Transformers ( http://arxiv.org/abs/2308.10814v3 )

ライセンス: Link先を確認

Natalia Frumkin, Dibakar Gope, Diana Marculescu,

(参考訳) 量子化スケールとビット幅は、ニューラルネットワークの量子化方法を考える上で最も重要なパラメータである。先行研究は、勾配法 (勾配降下法とヘッセ行列解析) を通じて、グローバルな方法で量子化スケールを最適化することに焦点を当てている。しかし、量子化スケールに摂動を適用すると、非常にギザギザで滑らかさを大きく欠くテスト損失ランドスケープが観察される。実際、量子化スケールの小さな摂動は精度に大きな影響を与え、4ビット量子化ビジョントランスフォーマー (ViT) において $0.5-0.8\%$ の精度向上をもたらす。この領域では、勾配法は局所最小値に確実に到達できないため破綻する。 Evol-Qと呼ばれる我々の研究では、進化的探索を用いて非滑らかなランドスケープを効果的に探索する。さらに我々は、小さなキャリブレーションデータセット ($1,000$ 枚の画像) における過学習の抑制に役立つだけでなく、そのような非滑らかな曲面の探索を容易にするinfoNCE損失の利用を提案する。 Evol-Q は完全量子化された ViT-Base のトップ-1 精度を、$3$ビット、$4$ビット、$8$ビットの重み量子化においてそれぞれ $10.30\%$、$0.78\%$、$0.15\%$ 改善する。様々なCNNおよびViTアーキテクチャに関する広範な実験は、極端な量子化シナリオにおけるその堅牢性をさらに示している。私たちのコードは https://github.com/enyac-group/evol-q で利用可能です。

Quantization scale and bit-width are the most important parameters when considering how to quantize a neural network. Prior work focuses on optimizing quantization scales in a global manner through gradient methods (gradient descent \& Hessian analysis). Yet, when applying perturbations to quantization scales, we observe a very jagged, highly non-smooth test loss landscape. In fact, small perturbations in quantization scale can greatly affect accuracy, yielding a $0.5-0.8\%$ accuracy boost in 4-bit quantized vision transformers (ViTs). In this regime, gradient methods break down, since they cannot reliably reach local minima. In our work, dubbed Evol-Q, we use evolutionary search to effectively traverse the non-smooth landscape. Additionally, we propose using an infoNCE loss, which not only helps combat overfitting on the small calibration dataset ($1,000$ images) but also makes traversing such a highly non-smooth surface easier. Evol-Q improves the top-1 accuracy of a fully quantized ViT-Base by $10.30\%$, $0.78\%$, and $0.15\%$ for $3$-bit, $4$-bit, and $8$-bit weight quantization levels. Extensive experiments on a variety of CNN and ViT architectures further demonstrate its robustness in extreme quantization scenarios. Our code is available at https://github.com/enyac-group/evol-q
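
The idea of searching over quantization scales by perturbation rather than by gradients can be illustrated with a toy sketch. This is not the Evol-Q implementation: it assumes a single scalar scale, uses plain mean squared quantization error as a stand-in for the test or infoNCE loss, and all function names (`quantize`, `quant_error`, `evolve_scale`) and hyperparameters are hypothetical.

```python
import random

def quantize(values, scale, bits=4):
    """Symmetric uniform quantization: round to the integer grid, clamp, rescale."""
    qmax = 2 ** (bits - 1) - 1
    qmin = -(2 ** (bits - 1))
    out = []
    for v in values:
        q = round(v / scale)
        q = max(qmin, min(qmax, q))
        out.append(q * scale)
    return out

def quant_error(values, scale, bits=4):
    """Mean squared error between original and quantized values (toy proxy loss)."""
    deq = quantize(values, scale, bits)
    return sum((a - b) ** 2 for a, b in zip(values, deq)) / len(values)

def evolve_scale(values, init_scale, bits=4, generations=30, children=8,
                 sigma=0.02, seed=0):
    """Perturb the current best scale; keep a child only if it lowers the error."""
    rng = random.Random(seed)
    best_scale = init_scale
    best_err = quant_error(values, best_scale, bits)
    for _ in range(generations):
        for _ in range(children):
            cand = best_scale * (1.0 + rng.gauss(0.0, sigma))
            if cand <= 0:
                continue  # a scale must stay positive
            err = quant_error(values, cand, bits)
            if err < best_err:
                best_scale, best_err = cand, err
    return best_scale, best_err

# Synthetic "weights" and a naive max-abs starting scale for 4-bit quantization.
data_rng = random.Random(1)
weights = [data_rng.gauss(0.0, 1.0) for _ in range(256)]
init_scale = max(abs(w) for w in weights) / 7
init_err = quant_error(weights, init_scale)
best_scale, best_err = evolve_scale(weights, init_scale)
print(init_err, best_err)
```

Because a candidate replaces the incumbent only when it strictly lowers the error, the search never ends up worse than the starting scale, and it needs no gradient of the rounding step.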

公開日:2024-09-26
翻訳日:2024-11-09 14:40:04

# 時空間グラフ条件拡散モデルを用いた、汚染された多変量時系列の異常検出

Contaminated Multivariate Time-Series Anomaly Detection with Spatio-Temporal Graph Conditional Diffusion Models ( http://arxiv.org/abs/2308.12563v3 )

ライセンス: Link先を確認

Thi Kieu Khanh Ho, Narges Armanfard,

(参考訳) 主流の教師なし異常検出アルゴリズムは、しばしば学術データセットで優れているが、クリーンなトレーニングデータを含む制御された実験条件のため、実際の性能は制限されている。ノイズによるトレーニングの課題に対処するためには,現実的な異常検出の課題として,しばしば見落とされがちである。先駆的な試みとして,感覚時系列異常検出(TSAD)におけるラベルレベルのノイズの領域について検討した。本稿では,トレーニングデータを異常で汚染した場合に,新しいかつ実用的な非教師付きTSADを提案する。 TSAD-Cと呼ばれるアプローチでは、トレーニングフェーズ中に異常ラベルにアクセスできない。 TSAD-Cは、トレーニング中に発生する異常(いわゆるノイズ)を修正できるデコンタミネータ、純粋な正規データのサロゲートと見なされるデコンタミネートデータ内の長期的な内部および変数間の依存関係をキャプチャするロングレンジ可変依存性モデリングモジュール、あらゆるタイプの異常を検出するアノマリー・スコーリングモジュールの3つのコアモジュールを含んでいる。 TSAD-Cが既存の手法を超越し,TSAD分野における新たな最先端技術を確立したことを,信頼性と多種多様な4つのデータセットで実証した。

Mainstream unsupervised anomaly detection algorithms often excel in academic datasets, yet their real-world performance is restricted due to the controlled experimental conditions involving clean training data. Addressing the challenge of training with noise, a prevalent issue in practical anomaly detection, is frequently overlooked. In a pioneering endeavor, this study delves into the realm of label-level noise within sensory time-series anomaly detection (TSAD). This paper presents a novel and practical end-to-end unsupervised TSAD when the training data is contaminated with anomalies. The introduced approach, called TSAD-C, is devoid of access to abnormality labels during the training phase. TSAD-C encompasses three core modules: a Decontaminator to rectify anomalies (aka noise) present during training, a Long-range Variable Dependency Modeling module to capture long-term intra- and inter-variable dependencies within the decontaminated data that is considered as a surrogate of the pure normal data, and an Anomaly Scoring module to detect anomalies from all types. Our extensive experiments conducted on four reliable and diverse datasets conclusively demonstrate that TSAD-C surpasses existing methodologies, thus establishing a new state-of-the-art in the TSAD field.

公開日:2024-09-26
翻訳日:2024-11-09 14:40:04

# EECS学生のためのハンズオン量子プログラミング研究室

Hands-on Quantum Programming Labs for EECS Students ( http://arxiv.org/abs/2308.14002v5 )

ライセンス: Link先を確認

Janche Sang, Chansu Yu,

(参考訳) 本報告では,電子工学と計算機科学(EECS)の学生に,専用のプログラムラボを通じて量子コンピューティングを教える実践的なアプローチを提案する。実験室は様々なトピックをカバーしており、絡み合い、量子ゲート、回路、量子鍵分布、DeutschとDeutsch-Jozsaアルゴリズム、Simonのアルゴリズム、Groverのアルゴリズムといった先進的なアルゴリズムを含む。教育者として、現場にいる仲間のインストラクターと教えの洞察とリソースを共有することを目的としている。興味のあるインストラクターには、完全なラボハンドアウトとプログラムテンプレートが提供される。さらに、このレポートは、それぞれの実験の設計の背後にある理論的根拠を解明し、量子コンピューティングのより深い理解を可能にする。

This report presents a practical approach to teaching quantum computing to Electrical Engineering & Computer Science (EECS) students through dedicated hands-on programming labs. The labs cover a diverse range of topics, encompassing fundamental elements, such as entanglement, quantum gates and circuits, as well as advanced algorithms including Quantum Key Distribution, Deutsch and Deutsch-Jozsa Algorithms, Simon's algorithm, and Grover's algorithm. As educators, we aim to share our teaching insights and resources with fellow instructors in the field. The full lab handouts and program templates are provided for interested instructors. Furthermore, the report elucidates the rationale behind the design of each experiment, enabling a deeper understanding of quantum computing.
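
As an illustration of the kind of exercise such a lab might contain, the following sketch computes, classically, the measurement amplitude that the Deutsch-Jozsa circuit produces. It is not taken from the authors' lab handouts; the function names and the example oracles are hypothetical, and a phase-oracle formulation is assumed.

```python
from itertools import product

def deutsch_jozsa_zero_amplitude(f, n):
    """Amplitude of |0...0> after H^n, the phase oracle (-1)^f(x), and H^n again.

    It equals (1 / 2^n) * sum over x of (-1)^f(x):
    +1 or -1 for a constant f, and 0 for a balanced f.
    """
    total = 0
    for bits in product((0, 1), repeat=n):
        total += -1 if f(bits) else 1
    return total / (2 ** n)

def classify(f, n):
    """Return 'constant' or 'balanced', assuming f satisfies the Deutsch-Jozsa promise."""
    prob_zero = deutsch_jozsa_zero_amplitude(f, n) ** 2
    return "constant" if prob_zero > 0.5 else "balanced"

def constant_one(bits):
    return 1

def parity(bits):
    return sum(bits) % 2  # balanced for any n >= 1

def first_bit(bits):
    return bits[0]  # balanced: half of all inputs start with 1

print(classify(constant_one, 3))  # constant
print(classify(parity, 3))        # balanced
print(classify(first_bit, 3))     # balanced
```

The loop visits all 2^n inputs, which is exactly the cost the quantum algorithm avoids: on quantum hardware a single oracle query followed by one measurement distinguishes the two cases.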

公開日:2024-09-23
翻訳日:2024-11-09 14:40:04

# LLM in the Shell: Generative Honeypots

LLM in the Shell: Generative Honeypots ( http://arxiv.org/abs/2309.00155v3 )

ライセンス: Link先を確認

Muris Sladić, Veronica Valeros, Carlos Catania, Sebastian Garcia,

(参考訳) ハニーポットはサイバーセキュリティにおいて、早期発見、脅威情報収集、攻撃者の行動分析に不可欠なツールである。しかし、そのほとんどは、人間の攻撃者を長期にわたって引きつけ、欺くために必要な現実味を欠いている。ハニーポットが容易に見分けられることは、その効果を強く妨げている。これは、決定論的すぎること、適応性の欠如、深みの欠如によって起こりうる。この研究は、Linuxライクなシェル出力を生成するLarge Language Modelsをベースとした、動的で現実的なソフトウェアハニーポットであるshelLMを導入している。我々はクラウドベースのLLMを用いてshelLMを設計・実装した。我々は,shelLMが実際のLinuxシェルから期待される通りの出力を生成できるかどうかを評価した。この評価は、サイバーセキュリティ研究者にハニーポットを使用してもらい、ハニーポットからの各回答がLinuxシェルから期待されるものであったかどうかをフィードバックしてもらうことで行った。以上の結果から,shelLMは現在のハニーポットの限界に対処できる、信頼性が高く動的な回答を生成できることが示唆された。 shelLM は TNR 0.90 に達し、実際の Linux シェルと整合性があることを人間に納得させた。実験を再現するためのソースコードとプロンプトが公開されている。

Honeypots are essential tools in cybersecurity for early detection, threat intelligence gathering, and analysis of attackers' behavior. However, most of them lack the realism required to engage and fool human attackers over the long term. Honeypots that are easy to distinguish lose much of their effectiveness. This can happen because they are too deterministic, lack adaptability, or lack depth. This work introduces shelLM, a dynamic and realistic software honeypot based on Large Language Models that generates Linux-like shell output. We designed and implemented shelLM using cloud-based LLMs. We evaluated whether shelLM can generate output as expected from a real Linux shell. The evaluation was done by asking cybersecurity researchers to use the honeypot and give feedback on whether each answer from the honeypot was the one expected from a Linux shell. Results indicate that shelLM can create credible and dynamic answers capable of addressing the limitations of current honeypots. ShelLM reached a TNR of 0.90, convincing humans it was consistent with a real Linux shell. The source code and prompts for replicating the experiments are publicly available.

公開日:2024-09-23
翻訳日:2024-11-09 14:40:04

# 古典的到着時間のモヤル変形

Moyal deformation of the classical arrival time ( http://arxiv.org/abs/2309.00222v4 )

ライセンス: Link先を確認

Dean Alvin L. Pablico, Eric A. Galapon,

(参考訳) 量子到着時間(TOA)問題は、粒子の初期状態のみが与えられたときに、測定される到着時間の統計を求める問題である。量子論の標準的な枠組みに従えば、この問題は古典的到着時間 $\mathcal{T}_C(q,p)$ の適切な量子像(通常は作用素形式 $\hat{\mathrm{T}}$)を見つけることに帰着する。本稿では、量子力学の位相空間定式化の中でこの問題を新たに考察する。得られる量子像は、$\hbar^2$ の形式的級数で表される実数値かつ時間反転対称な関数 $\mathcal{T}_M(q,p)$ であり、古典的到着時間を主項とする。これは系のハミルトニアンとのモヤル括弧関係から直接得られ、したがって古典的TOAのモヤル変形として解釈される。その性質について検討し、$\mathcal{T}_M(q,p)$ と [Eur. Phys. J. Plus \textbf{138}, 153 (2023)] で構築された艤装ヒルベルト空間(rigged Hilbert space)TOA 作用素との間の同型性を示すことによって、量子化に対する既知の障害をどのように回避するかを議論する。この作用素は任意の解析的ポテンシャルに対して常に時間-エネルギーの正準交換関係(TECCR)を満たす。次に、自由粒子と四次振動子ポテンシャルのTOA問題を例として考察する。

The quantum time of arrival (TOA) problem requires the statistics of measured arrival times given only the initial state of a particle. Following the standard framework of quantum theory, the problem translates into finding an appropriate quantum image of the classical arrival time $\mathcal{T}_C(q,p)$, usually in operator form $\hat{\mathrm{T}}$. In this paper, we consider the problem anew within the phase space formulation of quantum mechanics. The resulting quantum image is a real-valued and time-reversal symmetric function $\mathcal{T}_M(q,p)$, given as a formal series in $\hbar^2$ with the classical arrival time as the leading term. It is obtained directly from the Moyal bracket relation with the system Hamiltonian and is hence interpreted as a Moyal deformation of the classical TOA. We investigate its properties and discuss how it bypasses the known obstructions to quantization by showing the isomorphism between $\mathcal{T}_M(q,p)$ and the rigged Hilbert space TOA operator constructed in [Eur. Phys. J. Plus \textbf{138}, 153 (2023)], which always satisfies the time-energy canonical commutation relation (TECCR) for arbitrary analytic potentials. We then examine TOA problems for a free particle and a quartic oscillator potential as examples.

公開日:2024-09-27
翻訳日:2024-11-09 14:40:04

# Blizzard Challenge 2023におけるFruitShellフランス語音声合成システム

The FruitShell French synthesis system at the Blizzard 2023 Challenge ( http://arxiv.org/abs/2309.00223v3 )

ライセンス: Link先を確認

Xin Qi, Xiaopeng Wang, Zhiyong Wang, Wang Liu, Mingming Ding, Shuchen Shi,

(参考訳) 本稿では,Blizzard Challenge 2023のためのフランス語音声合成システムを提案する。この課題は、女性話者から高品質な音声を生成することと、特定の個人によく似た音声を生成することの2つのタスクから構成される。競合データについては,欠落したテキストデータや誤テキストデータを除去するスクリーニング処理を行った。音素以外のすべての記号を整理し,発音や持続時間を持たない記号を除去した。さらに、テキストに単語境界と開始/終了記号を追加し、過去の経験を基にした音声品質の向上を図った。 Spokeタスクでは,競合ルールに従ってデータ拡張を行った。我々は、オープンソースのG2Pモデルを使用して、フランス語のテキストを音素に書き起こした。 G2PモデルはIPA(International Phonetic Alphabet)を用いており、提案した競合データに同じ書き起こし処理を適用して標準化した。しかし、IPAチャートから特殊記号を認識する際のコンパイラの制限により、全ての音素を競合データに使用する音素に変換する規則に従った。最後に,全競合音声を均一サンプリングレート16kHzに再サンプリングした。ハイフィガンボコーダを用いたVITSを用いた音響モデルを用いた。 Spokeタスクでは,複数話者モデルを訓練し,モデルの持続時間予測器,ボコーダ,フロー層に話者情報を組み込んだ。システム評価の結果,Hubタスクが3.6,Spokeタスクが3.4,システムの平均レベルが全参加チーム中の平均値となった。

This paper presents a French text-to-speech synthesis system for the Blizzard Challenge 2023. The challenge consists of two tasks: generating high-quality speech from female speakers and generating speech that closely resembles specific individuals. Regarding the competition data, we conducted a screening process to remove missing or erroneous text data. We organized all symbols except for phonemes and eliminated symbols that had no pronunciation or zero duration. Additionally, we added word boundary and start/end symbols to the text, which we have found to improve speech quality based on our previous experience. For the Spoke task, we performed data augmentation according to the competition rules. We used an open-source G2P model to transcribe the French texts into phonemes. As the G2P model uses the International Phonetic Alphabet (IPA), we applied the same transcription process to the provided competition data for standardization. However, due to compiler limitations in recognizing special symbols from the IPA chart, we followed the rules to convert all phonemes into the phonetic scheme used in the competition data. Finally, we resampled all competition audio to a uniform sampling rate of 16 kHz. We employed a VITS-based acoustic model with the HiFi-GAN vocoder. For the Spoke task, we trained a multi-speaker model and incorporated speaker information into the duration predictor, vocoder, and flow layers of the model. The evaluation results of our system showed a quality MOS of 3.6 for the Hub task and 3.4 for the Spoke task, placing our system at an average level among all participating teams.

公開日:2024-09-25
翻訳日:2024-11-09 14:40:04

# 置換不変エンコーダとよりタイトな変分目的関数を用いたマルチモーダル生成モデルの学習

Learning multi-modal generative models with permutation-invariant encoders and tighter variational objectives ( http://arxiv.org/abs/2309.00380v3 )

ライセンス: Link先を確認

Marcel Hirt, Domenico Campolo, Victoria Leong, Juan-Pablo Ortega,

(参考訳) マルチモーダルデータに対する深い潜伏変数モデルの開発は、機械学習研究において長年のテーマであった。マルチモーダル変分オートエンコーダ(VAE)は、複数のモーダルを共同で説明する潜在表現を学習する一般的な生成モデルクラスである。このようなモデルに対する様々な目的関数が提案され、しばしばマルチモーダルデータ対数や情報理論的な考察から下界として動機付けられる。異なるモダリティ部分集合から潜在変数を符号化するために、Product-of-Experts(PoE)またはMixture-of-Experts(MoE)アグリゲーションスキームが日常的に使われ、例えば、複数のモダリティにわたる生成品質や一貫性に関して、異なるトレードオフをもたらすことが示されている。本研究では,データログ類似度を厳密に近似できる変動目標について考察する。我々は、置換不変ニューラルネットワークに基づく異なるモーダル性から符号化された特徴を組み合わせることにより、PoEやMoEアプローチの帰納バイアスを回避する、より柔軟なアグリゲーションスキームを開発する。数値解析実験では、多モード変動目的と様々なアグリゲーションスキームのトレードオフについて述べる。同定可能なモデルにおいて、観測されたモジュラリティと潜伏変数の真の関節分布を近似したい場合、我々の変動目的およびより柔軟な凝集モデルが有益であることが示される。

Devising deep latent variable models for multi-modal data has been a long-standing theme in machine learning research. Multi-modal Variational Autoencoders (VAEs) have been a popular generative model class that learns latent representations that jointly explain multiple modalities. Various objective functions for such models have been suggested, often motivated as lower bounds on the multi-modal data log-likelihood or from information-theoretic considerations. To encode latent variables from different modality subsets, Product-of-Experts (PoE) or Mixture-of-Experts (MoE) aggregation schemes have been routinely used and shown to yield different trade-offs, for instance, regarding their generative quality or consistency across multiple modalities. In this work, we consider a variational objective that can tightly approximate the data log-likelihood. We develop more flexible aggregation schemes that avoid the inductive biases in PoE or MoE approaches by combining encoded features from different modalities based on permutation-invariant neural networks. Our numerical experiments illustrate trade-offs for multi-modal variational objectives and various aggregation schemes. We show that our variational objective and more flexible aggregation models can become beneficial when one wants to approximate the true joint distribution over observed modalities and latent variables in identifiable models.
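
As a point of reference for the Product-of-Experts (PoE) aggregation mentioned above, here is a minimal sketch for one-dimensional Gaussian experts. It is the textbook precision-weighted product, not the paper's proposed permutation-invariant scheme; the function name is hypothetical, and the prior expert that multimodal VAEs often include is omitted for brevity.

```python
from typing import Sequence, Tuple


def poe_gaussian(means: Sequence[float], variances: Sequence[float]) -> Tuple[float, float]:
    """Combine 1-D Gaussian experts N(mean_i, var_i) by multiplying densities.

    The product of Gaussians is Gaussian with
      precision = sum_i 1/var_i
      mean      = (sum_i mean_i/var_i) / precision
    The result does not depend on the order of the experts.
    """
    if len(means) != len(variances) or len(means) == 0:
        raise ValueError("need one positive variance per mean, and at least one expert")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")
    precisions = [1.0 / v for v in variances]
    total_precision = sum(precisions)
    mean = sum(m * p for m, p in zip(means, precisions)) / total_precision
    return mean, 1.0 / total_precision


# Two modality encoders that disagree: the product sits between them
# and is more confident (smaller variance) than either expert alone.
joint_mean, joint_var = poe_gaussian([0.0, 2.0], [1.0, 1.0])
```

Because every expert sharpens the product, a single overconfident modality can dominate the result, which is one kind of inductive bias that a learned aggregation could relax.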

公開日:2024-09-24
翻訳日:2024-11-09 14:40:04

# ロバストなオンライン分類:推定からノイズ除去へ

Robust Online Classification: From Estimation to Denoising ( http://arxiv.org/abs/2309.01698v2 )

ライセンス: Link先を確認

Changlong Wu, Ananth Grama, Wojciech Szpankowski,

(参考訳) 一般仮説クラスを用いて,特徴のオンライン分類をラベルに分類する。我々の設定では、真のラベルは仮説クラス内の何らかの関数によって決定されるが、未知の確率ノイズによって破損し、その特徴は逆向きに生成される。観測されたノイズラベルとノイズレス特徴を用いて予測を行い、真のラベルと比較した場合の最小リスクを用いて性能を計測する。ノイズ機構は、個々のデータポイントに対して、実際のノイズラベル分布が選択された分布のセットを指定する一般的なノイズカーネルを介してモデル化される。提案手法は,カーネルが誘導するノイズラベル分布のHellingerギャップによって(仮説クラスサイズの対数係数まで)極小リスクを強く特徴付け,ノイズの手段や分散といった他の特性に依存しないことを示す。本手法は,オンライン設定に適したLe Cam-Birg\'eテストの条件付きバージョンとともに,2つの仮説のオンライン比較スキームへの新規な削減に基づく。本研究は,一般の雑音観測に対処しながら,基礎的真理を保証した,ノイズの多いオンライン分類の包括的特徴を初めて提供する。

We study online classification of features into labels with general hypothesis classes. In our setting, true labels are determined by some function within the hypothesis class but are corrupted by unknown stochastic noise, and the features are generated adversarially. Predictions are made using observed noisy labels and noiseless features, while the performance is measured via minimax risk when comparing against true labels. The noise mechanism is modeled via a general noise kernel that specifies, for any individual data point, a set of distributions from which the actual noisy label distribution is chosen. We show that minimax risk is tightly characterized (up to a logarithmic factor of the hypothesis class size) by the Hellinger gap of the noisy label distributions induced by the kernel, independent of other properties such as the means and variances of the noise. Our main technique is based on a novel reduction to an online comparison scheme of two hypotheses, along with a new conditional version of Le Cam-Birg\'e testing suitable for online settings. Our work provides the first comprehensive characterization for noisy online classification with guarantees with respect to the ground truth while addressing general noisy observations.
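
To make the Hellinger quantity concrete, the sketch below computes the standard squared Hellinger distance between two noisy-label distributions. The abstract does not give the paper's exact definition of the "Hellinger gap", so this is only the usual distance under the convention $H^2(p,q) = 1 - \sum_i \sqrt{p_i q_i}$; the symmetric flip-noise model and function names are illustrative assumptions.

```python
import math
from typing import List, Sequence


def squared_hellinger(p: Sequence[float], q: Sequence[float]) -> float:
    """Squared Hellinger distance H^2(p, q) = 1 - sum_i sqrt(p_i * q_i)."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same support size")
    bhattacharyya = sum(math.sqrt(a * b) for a, b in zip(p, q))
    # Clamp tiny negative values caused by floating-point rounding.
    return max(0.0, 1.0 - bhattacharyya)


def flip_noise(true_label: int, eta: float) -> List[float]:
    """Distribution of a binary label flipped with probability eta (toy noise model)."""
    return [1.0 - eta, eta] if true_label == 0 else [eta, 1.0 - eta]


# With eta = 0.1 the two noisy-label distributions remain well separated.
gap_at_10_percent = squared_hellinger(flip_noise(0, 0.1), flip_noise(1, 0.1))
```

In this toy model the distance shrinks to zero as the flip rate approaches 0.5, the point where noisy labels carry no information about the true label.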

公開日:2024-09-25
翻訳日:2024-11-09 14:40:04

# 逐次ボリューム設計課題のための表現学習

Representation Learning for Sequential Volumetric Design Tasks ( http://arxiv.org/abs/2309.02583v2 )

ライセンス: Link先を確認

Md Ferdous Alam, Yi Wang, Chin-Yi Cheng, Jieliang Luo,

(参考訳) ボリュームデザイン(英: volumetric design)は、マスキングデザインとも呼ばれる、プロの建築設計における最初の重要なステップであり、本質的にはシーケンシャルである。ボリューム設計プロセスは慎重な設計決定と反復的な調整を必要とするため、基礎となるシーケンシャル設計プロセスはデザイナーにとって貴重な情報をエンコードする。合理的なボリューム設計を自動生成するための多くの努力がなされているが、生成した設計ソリューションの品質は様々であり、設計ソリューションを評価するには、極めて包括的なメトリクスセットか、高価な人間の専門知識が必要である。従来,設計課題ではなく最終設計の学習に焦点をあてたアプローチでは,設計知識を専門家や高性能な設計シーケンスの集合から符号化し,トランスフォーマーモデルを用いて有用な表現を抽出することを提案した。後日、設計選好評価や手続き設計生成といった重要な下流アプリケーションにおいて、学習した表現を活用することを提案する。本研究では,学習した表現の密度を推定して嗜好モデルを開発する一方で,逐次設計生成のための自己回帰変換モデルを訓練する。数千のシーケンシャルなボリュームデザインの新たなデータセットを活用することで、私たちのアイデアを実証する。我々の選好モデルは、任意に与えられた2つの設計シーケンスを比較することができ、ランダムな設計シーケンスに対する評価において約90\%の精度を持つ。我々の自己回帰モデルは、部分設計シーケンスからボリューム設計シーケンスを自動補完することも可能である。

Volumetric design, also called massing design, is the first and critical step in professional building design which is sequential in nature. As the volumetric design process requires careful design decisions and iterative adjustments, the underlying sequential design process encodes valuable information for designers. Many efforts have been made to automatically generate reasonable volumetric designs, but the quality of the generated design solutions varies, and evaluating a design solution requires either a prohibitively comprehensive set of metrics or expensive human expertise. While previous approaches focused on learning only the final design instead of sequential design tasks, we propose to encode the design knowledge from a collection of expert or high-performing design sequences and extract useful representations using transformer-based models. Later we propose to utilize the learned representations for crucial downstream applications such as design preference evaluation and procedural design generation. We develop the preference model by estimating the density of the learned representations whereas we train an autoregressive transformer model for sequential design generation. We demonstrate our ideas by leveraging a novel dataset of thousands of sequential volumetric designs. Our preference model can compare two arbitrarily given design sequences and is almost $90\%$ accurate in evaluation against random design sequences. Our autoregressive model is also capable of autocompleting a volumetric design sequence from a partial design sequence.

公開日:2024-09-24
翻訳日:2024-11-09 14:40:04

# 窒素空孔電子スピン欠陥の制御可能性限界の定量化

Quantifying the limits of controllability for the nitrogen-vacancy electron spin defect ( http://arxiv.org/abs/2309.03120v2 )

ライセンス: Link先を確認

Paul Kairys, Jonathan C. Marcks, Nazar Delegan, Jiefei Zhang, David D. Awschalom, F. Joseph Heremans,

(参考訳) ダイヤモンドの窒素空孔中心のような固体電子スピン量子ビットは、感度を高めデバイスコヒーレンスを改善するために、集団反転の制御配列に依存している。しかし、このパラダイムシステムでさえ、集団反転の基本的な限界と量子センシングのような応用に対する潜在的な影響は定量的に評価されていない。ここでは、隣り合う核スピンの明示的なユニタリシミュレーションを含む、回転波近似を超えた高精度なシミュレーションを行う。量子最適制御を用いて、スピン-1基底状態内の量子ビット部分空間の制御のための解析パルスを同定し、パルス複雑性、制御時間、忠実度の関係を定量化する。制御期間を短縮した振幅と帯域幅の要求を指数関数的に増加させ,さらにサブナノ秒集団インバージョンを用いたマルチパルス列に対する非マルコフ効果の出現を定量化する。このことから、還元された忠実度と非マルコフ性は、電子スピンと核スピン環境とのコヒーレントな相互作用に起因すると判定する。最終的には、高忠実度多重パルス列に対するナノ秒制御の潜在的実現可能な機構を同定する。これらの結果は、ダイヤモンドの電子スピン欠陥を用いた量子情報処理の基本的な限界に関する重要な洞察を与える。

Solid-state electron spin qubits, like the nitrogen-vacancy center in diamond, rely on control sequences of population inversion to enhance sensitivity and improve device coherence. But even for this paradigmatic system, the fundamental limits of population inversion and potential impacts on applications like quantum sensing have not been assessed quantitatively. Here, we perform high accuracy simulations beyond the rotating wave approximation, including explicit unitary simulation of neighboring nuclear spins. Using quantum optimal control, we identify analytical pulses for the control of a qubit subspace within the spin-1 ground state and quantify the relationship between pulse complexity, control duration, and fidelity. We find exponentially increasing amplitude and bandwidth requirements with reduced control duration and further quantify the emergence of non-Markovian effects for multipulse sequences using sub-nanosecond population inversion. From this, we determine that the reduced fidelity and non-Markovianity is due to coherent interactions of the electron spin with the nuclear spin environment. Ultimately, we identify a potentially realizable regime of nanosecond control duration for high-fidelity multipulse sequences. These results provide key insights into the fundamental limits of quantum information processing using electron spin defects in diamond.

公開日:2024-09-24
翻訳日:2024-11-09 14:40:04

# グラウンドトゥルースの生成:ソフトラベルとラベルノイズ研究のための合成データ

Generating the Ground Truth: Synthetic Data for Soft Label and Label Noise Research ( http://arxiv.org/abs/2309.04318v2 )

ライセンス: Link先を確認

Sjoerd de Vries, Dirk Thierens,

(参考訳) 多くの実世界の分類タスクにおいて、ラベルノイズは機械学習モデルの一般化誤差に悪影響を及ぼす避けられない問題である。また, クリーンなラベルを使わずに, ラベルノイズが性能に与える影響を正確に定量化できないため, このようなノイズの処理方法の評価は困難である。ラベルノイズに関する既存の研究は、通常、ノイズまたは単純化されたシミュレーションデータをベースラインとして依存し、既知の特性を持つ追加ノイズを注入する。本稿では,これらの制約に対処するためのフレームワークであるSynLABELを紹介する。 SynLABELは、事前指定または学習された関数を基底真理関数として定義することをサポートし、新しいクリーンラベルの生成に使用できる。さらに、関数の領域内で選択された特徴の値を繰り返し再サンプリングし、関数を評価し、その結果のラベルを集約することにより、各データポイントにソフトラベルまたはラベル分布を割り当てることができる。これらの分布は多くの実世界のデータセットに存在する固有の不確実性を捉え、ラベルノイズの直接注入と定量化を可能にする。生成されたデータセットは、さまざまな種類のノイズを導入可能な、調整可能な複雑性のクリーンなベースラインとして機能する。さらに、ソフトラベル学習と関連する応用の研究を促進する。我々はSynLABELの応用を実演し、ラベルノイズを正確に定量化し、既存の手法よりも改善したことを示す。

In many real-world classification tasks, label noise is an unavoidable issue that adversely affects the generalization error of machine learning models. Additionally, evaluating how methods handle such noise is complicated, as the effect label noise has on their performance cannot be accurately quantified without clean labels. Existing research on label noise typically relies on either noisy or oversimplified simulated data as a baseline, into which additional noise with known properties is injected. In this paper, we introduce SYNLABEL, a framework designed to address these limitations by creating noiseless datasets informed by real-world data. SYNLABEL supports defining a pre-specified or learned function as the ground truth function, which can then be used for generating new clean labels. Furthermore, by repeatedly resampling values for selected features within the domain of the function, evaluating the function and aggregating the resulting labels, each data point can be assigned a soft label or label distribution. These distributions capture the inherent uncertainty present in many real-world datasets and enable the direct injection and quantification of label noise. The generated datasets serve as a clean baseline of adjustable complexity, into which various types of noise can be introduced. Additionally, they facilitate research into soft label learning and related applications. We demonstrate the application of SYNLABEL, showcasing its ability to precisely quantify label noise and its improvement over existing methodologies.
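
The resample-evaluate-aggregate step described above can be sketched as follows. This is not the SYNLABEL API: the function names, the toy ground-truth function, and the choice of a single hidden feature are all hypothetical, and the hidden feature's values are passed in explicitly so the example stays deterministic.

```python
from collections import Counter
from typing import Callable, List, Sequence


def soft_label(
    ground_truth: Callable[[float, float], int],
    visible: float,
    hidden_samples: Sequence[float],
    n_classes: int,
) -> List[float]:
    """Label distribution for one data point whose second feature is hidden.

    Each resampled hidden value is passed through the ground-truth function,
    and the resulting hard labels are aggregated into relative frequencies.
    """
    if len(hidden_samples) == 0:
        raise ValueError("need at least one resampled value")
    counts = Counter(ground_truth(visible, h) for h in hidden_samples)
    return [counts.get(c, 0) / len(hidden_samples) for c in range(n_classes)]


def toy_ground_truth(visible: float, hidden: float) -> int:
    """Hypothetical deterministic labeling function used as the ground truth."""
    return int(visible + hidden > 1.0)


# Hiding the second feature turns a deterministic label into a distribution.
distribution = soft_label(toy_ground_truth, 0.5, [0.0, 0.5, 1.0, 1.5], n_classes=2)
```

Here two of the four resampled values push the sum above the threshold, so the point receives the soft label `[0.5, 0.5]`.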

公開日:2024-09-23
翻訳日:2024-11-09 14:40:04

# 量子コンピュータのためのリアルタイムでスケーラブル、高速かつ高リソース効率なデコーダ

A real-time, scalable, fast and highly resource efficient decoder for a quantum computer ( http://arxiv.org/abs/2309.05558v2 )

ライセンス: Link先を確認

Ben Barber, Kenton M. Barnes, Tomasz Bialas, Okan Buğdaycı, Earl T. Campbell, Neil I. Gillespie, Kauser Johar, Ram Rajan, Adam W. Richardson, Luka Skoric, Canberk Topal, Mark L. Turner, Abbas B. Ziad,

(参考訳) 量子コンピュータの可能性を解き放つためには、量子ビットの性能に対するノイズ効果を慎重に管理する必要がある。ノイズによって引き起こされる計算エラーを診断するデコーダは、大きな量子ビット数へのスケーリングと低温動作を可能にするために、リソースを効率的に利用しなければならない。さらに、量子コンピュータの論理クロックレートが指数関数的に遅くなるのを避けるために、高速に動作する必要がある。このような課題を克服するために、Collision Clusteringデコーダを導入し、FPGAおよびASICハードウェア上で実装する。代表的な量子誤り訂正方式である表面符号を用いて論理メモリ実験をシミュレーションし、超伝導量子ビットなどの高速動作モダリティの要求に合致するMHz復号速度を、FPGAとASICでそれぞれ最大881および1057量子ビットの表面符号に対して実証した。 ASIC の設計は 0.06 mm$^2$ であり、わずか 8 mW の電力しか消費しない。我々のデコーダは高い性能とリソース効率を持ち、フォールトトレラントな量子コンピュータを実現するための実行可能な道を開く。

To unleash the potential of quantum computers, noise effects on qubits' performance must be carefully managed. The decoders responsible for diagnosing noise-induced computational errors must use resources efficiently to enable scaling to large qubit counts and cryogenic operation. Additionally, they must operate at speed, to avoid an exponential slowdown in the logical clock rate of the quantum computer. To overcome such challenges, we introduce the Collision Clustering decoder and implement it on FPGA and ASIC hardware. We simulate logical memory experiments using the leading quantum error correction scheme, the surface code, and demonstrate MHz decoding speed, matching the requirements of fast-operating modalities such as superconducting qubits, for surface codes of up to 881 and 1057 qubits with the FPGA and ASIC, respectively. The ASIC design occupies 0.06 mm$^2$ and consumes only 8 mW of power. Our decoder is both highly performant and resource efficient, unlocking a viable path to practically realising fault-tolerant quantum computers.

公開日:2024-09-24
翻訳日:2024-11-09 14:28:50

# 断熱へのショートカットによる高忠実度の巨視的重ね合わせ状態

High fidelity macroscopic superposition states via shortcut to adiabaticity ( http://arxiv.org/abs/2309.06031v2 )

ライセンス: Link先を確認

Mehdi Aslani, Vahid Salari, Mehdi Abdi,

(参考訳) 巨視的空間重畳状態の大規模物体を調製するために, 断熱方式のショートカットを提案する。本稿では, トラップ電位をパラボラから二重井戸に調整しながら, 即時ハミルトニアンの基底状態におけるシステム維持に反断熱駆動を用いることを提案する。これは、制御パラメータを適切に傾斜させて行われる。いくつかの反断熱ドライブは、ほとんどのケースで十分であることを示す。この実装のために超伝導回路のハイブリッド電気機械構成を提案する。本手法の効率は,ノイズや不完全性の存在下でのシステムの力学を数値的に解くことで評価される。その結果,高忠実度で空間的に識別可能な猫状態を持つ機械共振器をプロトコルを用いて作成できることが示唆された。さらに、このプロトコルはノイズや不完全性に対して堅牢である。また、結合回路電気力学キャビティモードの分光による最終状態の検証手法についても検討する。我々の研究は、将来の実験において、マクロな重ね合わせ状態を実現し、検証するための基礎研究として役立てることができる。

A shortcut-to-adiabaticity scheme is proposed for preparing a massive object in a macroscopic spatial superposition state. In this scheme we propose to employ counterdiabatic driving to maintain the system in the ground state of its instantaneous Hamiltonian while the trap potential is tuned from a parabola to a double well. This, in turn, is performed by properly ramping a control parameter. We show that a few counterdiabatic drives are enough for most practical cases. A hybrid electromechanical setup in superconducting circuits is proposed for the implementation. The efficiency of our scheme is benchmarked by numerically solving the system dynamics in the presence of noise and imperfections. The results show that a mechanical resonator with very-high-fidelity spatially distinguishable cat states can be prepared with our protocol. Furthermore, the protocol is robust against noise and imperfections. We also discuss a method for verifying the final state via spectroscopy of a coupled circuit electrodynamical cavity mode. Our work can serve as the groundwork to feasibly realize and verify macroscopic superposition states in future experiments.
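
For orientation, the general textbook form of a counterdiabatic term is given below. It is the standard transitionless-driving expression, not the specific truncated drives used in the paper, which the abstract does not state. The notation is an assumption: $|n(t)\rangle$ denotes the instantaneous eigenstates of the reference Hamiltonian, $H_0(t)|n(t)\rangle = E_n(t)|n(t)\rangle$.

```latex
H_{\mathrm{CD}}(t) = i\hbar \sum_n \Big( |\partial_t n(t)\rangle\langle n(t)|
  - \langle n(t)|\partial_t n(t)\rangle \, |n(t)\rangle\langle n(t)| \Big)
```

Evolving under $H_0(t) + H_{\mathrm{CD}}(t)$ keeps a system prepared in $|n(0)\rangle$ in the instantaneous eigenstate $|n(t)\rangle$ for any ramp speed, which is why the trap can be deformed quickly without exciting the system out of its ground state.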

公開日:2024-09-26
翻訳日:2024-11-09 14:28:50

# 再読(Re-Reading)は大規模言語モデルの推論を改善する

Re-Reading Improves Reasoning in Large Language Models ( http://arxiv.org/abs/2309.06275v3 )

ライセンス: Link先を確認

Xiaohan Xu, Chongyang Tao, Tao Shen, Can Xu, Hongbo Xu, Guodong Long, Jian-guang Lou, Shuai Ma,

(参考訳) 既成のLarge Language Models (LLMs) の推論能力を高めるために, 単純で汎用的で効果的なプロンプト手法であるRe2を導入する。出力の推論プロセスを引き出すことを目的としたChain-of-Thought (CoT) のような、ほとんどの思考誘発型プロンプト手法とは異なり、Re2 は質問を2回処理することで、入力に焦点を移し、理解プロセスを強化する。その結果、Re2 は CoT を含むほとんどの思考誘発型プロンプト手法との強い汎用性と互換性を示す。重要なことに、Re2は、第1パスが第2パスにグローバル情報を提供できるため、一方向デコーダのみのLLMで「双方向」エンコーディングを容易にする。まず、Re2の基礎となる予備的な実証研究から始め、「双方向」注意機構を実現する可能性を示す。その後、14のデータセットにわたる広範囲な推論ベンチマークでRe2を評価し、112の実験にまたがって、その有効性と汎用性を検証する。以上の結果から,バニラChatGPTでのいくつかのシナリオを除いて,Re2は単純な再読戦略によってLLMの推論性能を一貫して向上させることがわかった。さらなる分析により、Re2の適応性を明らかにし、異なるLLM、思考誘発型プロンプト、アンサンブル戦略と効果的に統合する方法を示す。私たちのコードは \url{https://github.com/Tebmer/Rereading-LLM-Reasoning/} で利用可能です。

To enhance the reasoning capabilities of off-the-shelf Large Language Models (LLMs), we introduce a simple, yet general and effective prompting method, Re2, i.e., \textbf{Re}-\textbf{Re}ading the question as input. Unlike most thought-eliciting prompting methods, such as Chain-of-Thought (CoT), which aim to elicit the reasoning process in the output, Re2 shifts the focus to the input by processing questions twice, thereby enhancing the understanding process. Consequently, Re2 demonstrates strong generality and compatibility with most thought-eliciting prompting methods, including CoT. Crucially, Re2 facilitates a "bidirectional" encoding in unidirectional decoder-only LLMs because the first pass could provide global information for the second pass. We begin with a preliminary empirical study as the foundation of Re2, illustrating its potential to enable "bidirectional" attention mechanisms. We then evaluate Re2 on extensive reasoning benchmarks across 14 datasets, spanning 112 experiments, to validate its effectiveness and generality. Our findings indicate that, with the exception of a few scenarios on vanilla ChatGPT, Re2 consistently enhances the reasoning performance of LLMs through a simple re-reading strategy. Further analyses reveal Re2's adaptability, showing how it can be effectively integrated with different LLMs, thought-eliciting prompting, and ensemble strategies. Our code is available at \url{https://github.com/Tebmer/Rereading-LLM-Reasoning/}
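
Since Re2 amounts to presenting the question twice in the input, it can be sketched as a small prompt builder. The exact template wording below ("Read the question again:" and the CoT trigger) is an assumption for illustration, since the abstract does not give the template, and `build_re2_prompt` is a hypothetical helper rather than part of the authors' released code.

```python
def build_re2_prompt(question: str, use_cot: bool = True) -> str:
    """Build a prompt that repeats the question before asking for an answer.

    The template text is an illustrative assumption; only the idea of
    feeding the question twice comes from the abstract.
    """
    question = question.strip()
    if not question:
        raise ValueError("question must be non-empty")
    lines = [
        f"Q: {question}",
        f"Read the question again: {question}",
    ]
    # Re2 is described as compatible with CoT, so the trigger is optional.
    lines.append("A: Let's think step by step." if use_cot else "A:")
    return "\n".join(lines)


example_question = "A farmer has 3 pens with 4 sheep each. How many sheep are there?"
example_prompt = build_re2_prompt(example_question)
```

The second copy of the question is the only change relative to a plain CoT prompt, so the technique needs no model changes and only lengthens the input by the size of the question.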

公開日:2024-09-21
翻訳日:2024-11-09 14:28:50

# (ほぼ)量子ベルの不等式とデバイス非依存の応用

(Almost-)Quantum Bell Inequalities and Device-Independent Applications ( http://arxiv.org/abs/2309.06304v4 )

ライセンス: Link先を確認

Yuan Liu, Ho Yiu Chung, Ravishankar Ramanathan,

(参考訳) 近年、量子ベル不等式の導出を通じた量子相関集合の境界に関する研究が注目を集めている。これはツィレルソンの問題と関連しており、デバイス非依存(DI)情報処理に重要な応用がある。しかし、量子ベル不等式を決定することは非常に難しい課題として知られており、孤立した例のみが知られている。本稿では、(ほぼ)量子ベル不等式の族を提示し、3つの基礎的およびDI的応用に焦点を当てる。第一に、ノンシグナリング境界上の量子相関は、弱い乱数源からのDIランダム性抽出において重要である。 2つのk値測定を持つ2人のプレイヤーという実用的なベルシナリオにおいて、量子ベル不等式を導出し、次元が最大4k-8のノンシグナリング境界の特定の部分から量子境界が分離していることを示し、従来の結果を拡張する。その直接の副産物として、量子系およびほぼ量子相関に対するオーマン(Aumann)の合意定理の一般的な証明を与える。これは、オーマンの合意定理が、一般的なノンシグナリング理論の中から量子論とほぼ量子相関の両方を選び出すための、認識論の文脈における合理的な物理原理であることを意味する。第二に、m個の二値測定を持つ2人のプレイヤーのシナリオにおける量子ベル不等式の族を提示し、これが2量子ビットのシングレットと2m個の測定の自己検証に役立つことを示す。興味深いことに、この主張はTsirelson-Landau-Masanesによって発見された m=2 の結果を一般化し、最先端の DIRA に対する改善を示す。最後に、我々の量子ベル不等式を用いて、量子相関集合を特徴づける情報理論的原理である「非局所計算における優位性なし」の原理の一般形を導出する。これにより、これまでに知られている中で最も精密な量子境界の特徴づけを与える。

Investigations of the boundary of the quantum correlation set through the derivation of quantum Bell inequalities have gained increased attention in recent years; they are related to Tsirelson's problem and have significant applications in device-independent (DI) information processing. However, determining quantum Bell inequalities is a notoriously difficult task and only isolated examples are known. In this paper, we present families of (almost-)quantum Bell inequalities and highlight three foundational and DI applications. Firstly, quantum correlations on the no-signaling boundary are crucial in DI randomness extraction from weak sources. In the practical Bell scenario of two players with two k-outcome measurements, we derive quantum Bell inequalities that show a separation of the quantum boundary from certain portions of the no-signaling boundary of dimension up to 4k-8, extending previous results. As an immediate by-product of this, we give a general proof of Aumann's agreement theorem for quantum systems and the almost-quantum correlations, which implies that Aumann's agreement theorem is a reasonable physical principle in the context of epistemics to pick out both quantum theory and almost-quantum correlations from general no-signaling theories. Secondly, we present a family of quantum Bell inequalities in the scenario of two players with m binary measurements, which serve to self-test the two-qubit singlet and 2m measurements. Interestingly, this claim generalizes the result for m=2 discovered by Tsirelson-Landau-Masanes and shows an improvement over the state-of-the-art DIRA. Lastly, we use our quantum Bell inequalities to derive the general form of the principle of no advantage in nonlocal computation, which is an information-theoretic principle that serves to characterize the quantum correlation set. With this, we provide the most precise characterization of the quantum boundary known so far.

公開日:2024-09-27
翻訳日:2024-11-09 14:28:50

# $\texttt{NePhi}$: 近似的に微分同相な医用画像レジストレーションのためのニューラル変形場

$\texttt{NePhi}$: Neural Deformation Fields for Approximately Diffeomorphic Medical Image Registration ( http://arxiv.org/abs/2309.07322v3 )

ライセンス: Link先を確認

Lin Tian, Hastings Greer, Raúl San José Estépar, Roni Sengupta, Marc Niethammer,

(参考訳) この研究は、近似的に微分同相な変換をもたらす一般化可能なニューラル変形モデルNePhiを提案する。学習ベースのレジストレーション手法で主に使用されるボクセルベースの変換場とは対照的に、NePhiは変形を関数的に表現し、トレーニングおよび推論時のメモリ消費、推論時間、レジストレーション精度、変換の正則性からなる設計空間において大きな柔軟性をもたらす。具体的には、NePhiは 1) ボクセルベースの学習手法に比べてメモリ消費が少ない。 2) 既存のニューラル変形に基づくレジストレーション手法が最適化のみに依存しているのに対して,潜在コードの予測により推論速度が向上する。 3) インスタンス最適化により精度が向上する。 4) 医用画像レジストレーションに非常に望ましい、優れた変形正則性を示す。2次元合成データセットおよび実際の3次元医用画像データセット(肺や脳など)でNePhiの性能を実証する。以上の結果から,NePhiは単一解像度のレジストレーション設定において,ボクセルに基づく表現の精度に匹敵することがわかった。マルチ解像度レジストレーションでは、インスタンス最適化を伴う現在のSOTA学習ベースのレジストレーション手法の精度に匹敵しつつ、メモリ要求を5分の1に削減する。私たちのコードは https://github.com/uncbiag/NePhi で公開されています。

This work proposes NePhi, a generalizable neural deformation model which results in approximately diffeomorphic transformations. In contrast to the predominant voxel-based transformation fields used in learning-based registration approaches, NePhi represents deformations functionally, leading to great flexibility within the design space of memory consumption during training and inference, inference time, registration accuracy, as well as transformation regularity. Specifically, NePhi 1) requires less memory compared to voxel-based learning approaches, 2) improves inference speed by predicting latent codes, compared to current existing neural deformation based registration approaches that \emph{only} rely on optimization, 3) improves accuracy via instance optimization, and 4) shows excellent deformation regularity which is highly desirable for medical image registration. We demonstrate the performance of NePhi on a 2D synthetic dataset as well as for real 3D medical image datasets (e.g., lungs and brains). Our results show that NePhi can match the accuracy of voxel-based representations in a single-resolution registration setting. For multi-resolution registration, our method matches the accuracy of current SOTA learning-based registration approaches with instance optimization while reducing memory requirements by a factor of five. Our code is available at https://github.com/uncbiag/NePhi.

公開日:2024-09-27
翻訳日:2024-11-09 14:28:50

# C-Pack:汎用中国語埋め込みのためのパッケージ化リソース

C-Pack: Packed Resources For General Chinese Embeddings ( http://arxiv.org/abs/2309.07597v5 )

ライセンス: Link先を確認

Shitao Xiao, Zheng Liu, Peitian Zhang, Niklas Muennighoff, Defu Lian, Jian-Yun Nie,

(参考訳) C-Packは、汎用中国語埋め込みの分野を著しく前進させるリソースのパッケージである。 C-Packには3つの重要なリソースが含まれている。 1) C-MTEBは6つのタスクと35のデータセットをカバーする中国語テキスト埋め込みの総合ベンチマークである。 2) C-MTPは, ラベル付き, ラベルなしの中国語コーパスから収集・整備された, 埋め込みモデルを訓練するための大規模なテキスト埋め込みデータセットである。 3) C-TEMは、複数のサイズをカバーする埋め込みモデルのファミリーである。我々のモデルは、リリース時点で、C-MTEB上の従来のすべての中国語テキスト埋め込みを最大で+10%上回っている。また、C-TEMのための一連のトレーニング方法全体を統合し、最適化する。汎用中国語埋め込みに関するリソースに加えて、英語のテキスト埋め込みのためのデータとモデルもリリースする。 MTEBベンチマークでは、英語モデルは最先端のパフォーマンスを達成している。また、リリースした英語データは中国語データの2倍の規模である。これらのリソースはすべて https://github.com/FlagOpen/FlagEmbedding で公開されている。

We introduce C-Pack, a package of resources that significantly advance the field of general Chinese embeddings. C-Pack includes three critical resources. 1) C-MTEB is a comprehensive benchmark for Chinese text embeddings covering 6 tasks and 35 datasets. 2) C-MTP is a massive text embedding dataset curated from labeled and unlabeled Chinese corpora for training embedding models. 3) C-TEM is a family of embedding models covering multiple sizes. Our models outperform all prior Chinese text embeddings on C-MTEB by up to +10% at the time of release. We also integrate and optimize the entire suite of training methods for C-TEM. Along with our resources on general Chinese embedding, we release our data and models for English text embeddings. The English models achieve state-of-the-art performance on the MTEB benchmark; meanwhile, our released English data is 2 times larger than the Chinese data. All these resources are made publicly available at https://github.com/FlagOpen/FlagEmbedding.

公開日:2024-09-24
翻訳日:2024-11-09 14:28:50

# Spectrum-Aware Debiasing: 主成分回帰への応用を伴う現代的な推論フレームワーク

Spectrum-Aware Debiasing: A Modern Inference Framework with Applications to Principal Components Regression ( http://arxiv.org/abs/2309.07810v4 )

ライセンス: Link先を確認

Yufan Li, Pragya Sur,

(参考訳) デバイアス(バイアス除去)は高次元統計学における基本的な概念である。自由度調整は、高次元線形回帰における最先端技術である一方、これはi.i.d.サンプルと亜ガウス共変量に限られる。これらの制約は、その広範な実用性を妨げている。本稿では,高次元回帰のための新しい手法であるSpectrum-Aware Debiasingを紹介する。我々のアプローチは、構造化された依存関係、重いテール、低ランク構造を伴う問題に適用できる。提案手法は, サンプル共分散行列のスペクトル情報を用いて再スケーリング係数を導出し, 再スケールされた勾配降下ステップによってデバイアスを実現する。スペクトルベースのアプローチは、より広い文脈での正確なデバイアスを可能にする。特徴量とサンプル数が比例的にスケールする、現代的で一般的な領域を考察する。我々は、共変量が右回転不変であるとき、様々な収束概念の下で、提案した推定量の(適切に中心化およびスケール化された)漸近正規性を確立する。このような設計は、圧縮センシングにおいて重要な役割を担っているため、近年注目を集めている。さらに、その漸近分散に対する一致推定量を考案する。本研究には2つの注目すべき副産物がある。第1に、主成分回帰(PCR)のバイアスを補正するためにSpectrum-Aware Debiasingを使用し、高次元における最初のデバイアスされたPCR推定量を提供する。第2に、信号とサンプル共分散行列の固有ベクトルとの整合を確認するための原理的な検定を導入する。この検定は、近似メッセージパッシング、leave-one-out、凸ガウスmin-max定理を用いて開発された統計手法にとって、それ自体で有用である。シミュレーションおよび実データ実験により本手法を実証する。技術的には、近似メッセージパッシングアルゴリズムとデバイアスを結びつけ、ベクトル近似メッセージパッシング(V-AMP)のコーシー性の最初の証明を提供する。

Debiasing is a fundamental concept in high-dimensional statistics. While degrees-of-freedom adjustment is the state-of-the-art technique in high-dimensional linear regression, it is limited to i.i.d. samples and sub-Gaussian covariates. These constraints hinder its broader practical use. Here, we introduce Spectrum-Aware Debiasing--a novel method for high-dimensional regression. Our approach applies to problems with structured dependencies, heavy tails, and low-rank structures. Our method achieves debiasing through a rescaled gradient descent step, deriving the rescaling factor using spectral information of the sample covariance matrix. The spectrum-based approach enables accurate debiasing in much broader contexts. We study the common modern regime where the number of features and samples scale proportionally. We establish asymptotic normality of our proposed estimator (suitably centered and scaled) under various convergence notions when the covariates are right-rotationally invariant. Such designs have garnered recent attention due to their crucial role in compressed sensing. Furthermore, we devise a consistent estimator for its asymptotic variance. Our work has two notable by-products: first, we use Spectrum-Aware Debiasing to correct bias in principal components regression (PCR), providing the first debiased PCR estimator in high dimensions. Second, we introduce a principled test for checking alignment between the signal and the eigenvectors of the sample covariance matrix. This test is independently valuable for statistical methods developed using approximate message passing, leave-one-out, or convex Gaussian min-max theorems. We demonstrate our method through simulated and real data experiments. Technically, we connect approximate message passing algorithms with debiasing and provide the first proof of the Cauchy property of vector approximate message passing (V-AMP).
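
The "rescaled gradient descent step" can be written schematically as below. This is a sketch of the generic one-step form only: the symbols $X$ (design matrix), $y$ (response), $\hat{\beta}$ (initial regularized estimate), and the scalar $\widehat{\mathrm{adj}}$ are notation assumed here, and the abstract does not state how the rescaling factor is computed from the spectrum of the sample covariance matrix.

```latex
\hat{\beta}^{\mathrm{u}} \;=\; \hat{\beta} \;+\; \frac{1}{\widehat{\mathrm{adj}}}\, X^{\top}\bigl(y - X\hat{\beta}\bigr)
```

The correction term $X^{\top}(y - X\hat{\beta})$ is, up to sign and scaling, the gradient of the least-squares loss at $\hat{\beta}$, so the debiased estimate is one gradient step from the initial estimate with a data-driven step size. In degrees-of-freedom debiasing the scalar comes from a degrees-of-freedom correction, whereas here it is described as depending on the spectrum of the sample covariance.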

公開日:2024-10-04
翻訳日:2024-11-09 14:28:50

# 量子干渉による重力相互作用ダークマターの検出

Detecting Gravitationally Interacting Dark Matter with Quantum Interference ( http://arxiv.org/abs/2309.08238v3 )

ライセンス: Link先を確認

Alejandro Perez, Carlo Rovelli, Marios Christodoulou,

(参考訳) ダークマターの存在を示す多くの天文学的な証拠にもかかわらず、ダークマターの性質は謎のままである。重力のみ、あるいはほぼ重力のみで相互作用する粒子、特に量子重力の基本的なスケールであるプランク質量付近の質量を持つ粒子は、興味深い候補である。ここでは、高感度な重力媒介の量子位相シフトを用いて、そのような粒子を直接検出する理論的可能性を示す。特に、ジョセフソン接合を利用したプロトコルを考える。

In spite of the large astronomical evidence for its existence, the nature of dark matter remains enigmatic. Particles that interact only, or almost only, gravitationally, in particular with masses around the Planck mass (the fundamental scale in quantum gravity), are intriguing candidates. Here we show that there is a theoretical possibility to directly detect such particles using highly sensitive gravity-mediated quantum phase shifts. In particular, we consider a protocol utilizing Josephson junctions.

公開日:2024-09-27
翻訳日:2024-11-09 14:28:50

# フルオロベンゼン中の電子ウェーブパケットのイオン化と励起によるアトケミカル量子干渉のシグナル

Signature of attochemical quantum interference upon ionization and excitation of an electronic wavepacket in fluoro-benzene ( http://arxiv.org/abs/2309.08269v3 )

ライセンス: Link先を確認

Anthony Ferté, Dane Austin, Allan S. Johnson, Felicity McGrath, João Pedro Malhado, Jon P. Marangos, Morgane Vacher,

(参考訳) 超短パルスは分子を励起またはイオン化し、コヒーレントな電子ウェーブパケットを生成して、複雑なダイナミクスを引き起こす。本研究では, (重水素化)ベンゼンとフルオロベンゼン分子の異なる電子ウェーブパケットへのイオン化に伴う電子と核の結合ダイナミクスを, 量子力学的かつ全次元でシミュレートする。フルオロベンゼンでは、計算により状態間および状態内の量子干渉の両方が明らかになり、これらは自己相関関数の形状にアト化学と電荷方向性ダイナミクスの明確なシグネチャを残す。後者はベンゼンとフルオロベンゼンの実験的な高調波分光測定と一致している。

Ultrashort pulses can excite or ionize molecules and populate coherent electronic wavepackets, inducing complex dynamics. In this work, we simulate the coupled electron-nuclear dynamics upon ionization to different electronic wavepackets of (deuterated) benzene and fluoro-benzene molecules, quantum mechanically and in full dimensionality. In fluoro-benzene, the calculations unravel both inter-state and intra-state quantum interferences that leave clear signatures of attochemistry and charge-directed dynamics in the shape of the autocorrelation function. The latter are in agreement with experimental high harmonic spectroscopy measurements of benzenes and fluoro-benzene.

公開日:2024-09-23
翻訳日:2024-11-09 14:28:50

# YCB-Ev 1.1:6DoFオブジェクトポーズ推定のためのイベントビジョンデータセット

YCB-Ev 1.1: Event-vision dataset for 6DoF object pose estimation ( http://arxiv.org/abs/2309.08482v2 )

ライセンス: Link先を確認

Pavel Rojtberg, Thomas Pöllabauer,

(参考訳) 本研究は,同期RGB-Dフレームとイベントデータを含むYCB-Evデータセットを導入し,これらのモダリティを用いた6DoFオブジェクトポーズ推定アルゴリズムの評価を可能にする。このデータセットは、YCB-Video(YCB-V)データセットで使用されたのと同じ21のYCBオブジェクトに対して、グラウンドトゥルースの6DoFオブジェクトポーズを提供し、データセット間でのアルゴリズム性能評価を可能にする。データセットは21の同期イベントとRGB-Dシーケンスで構成され、合計で13,851フレーム(イベントデータ7分43秒)である。特に、これらのシーケンスのうち12は、BOPチャレンジで使用されるYCB-Vサブセットと同じオブジェクト配置である。グラウンドトゥルースのポーズは、RGB-Dフレーム内のオブジェクトを検出し、イベントタイムスタンプに合わせるためにポーズを補間し、外部キャリブレーションを用いてイベント座標フレームに転送することで生成される。私たちのデータセットは、イベントストリームにグラウンドトゥルースの6DoFポーズデータを提供する最初のものです。さらに,新しいYCB-Vシーケンスを用いて,BOPチャレンジのために事前学習された2つの最先端アルゴリズムの一般化能力を評価する。データセットは https://github.com/paroj/ycbev で公開されている。

Our work introduces the YCB-Ev dataset, which contains synchronized RGB-D frames and event data that enables evaluating 6DoF object pose estimation algorithms using these modalities. This dataset provides ground truth 6DoF object poses for the same 21 YCB objects that were used in the YCB-Video (YCB-V) dataset, allowing for cross-dataset algorithm performance evaluation. The dataset consists of 21 synchronized event and RGB-D sequences, totalling 13,851 frames (7 minutes and 43 seconds of event data). Notably, 12 of these sequences feature the same object arrangement as the YCB-V subset used in the BOP challenge. Ground truth poses are generated by detecting objects in the RGB-D frames, interpolating the poses to align with the event timestamps, and then transferring them to the event coordinate frame using extrinsic calibration. Our dataset is the first to provide ground truth 6DoF pose data for event streams. Furthermore, we evaluate the generalization capabilities of two state-of-the-art algorithms, which were pre-trained for the BOP challenge, using our novel YCB-V sequences. The dataset is publicly available at https://github.com/paroj/ycbev.

公開日:2024-09-25
翻訳日:2024-11-09 14:28:50
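
The interpolation step described in the YCB-Ev abstract above (aligning RGB-D frame poses with event timestamps) can be sketched as follows. This is only an illustration under assumptions: the abstract does not say which interpolation scheme the authors use, so the sketch pairs linear interpolation of the translation with spherical linear interpolation (slerp) of a unit quaternion. The function names `slerp` and `interpolate_pose` and the sample poses are hypothetical.

```python
import math

def slerp(q0, q1, t):
    """Spherical linear interpolation between unit quaternions (w, x, y, z)."""
    dot = sum(a * b for a, b in zip(q0, q1))
    if dot < 0.0:  # flip one quaternion so we follow the shorter arc
        q1 = [-c for c in q1]
        dot = -dot
    if dot > 0.9995:  # nearly parallel: plain linear blend, normalized below
        q = [a + t * (b - a) for a, b in zip(q0, q1)]
    else:
        omega = math.acos(dot)
        s0 = math.sin((1.0 - t) * omega) / math.sin(omega)
        s1 = math.sin(t * omega) / math.sin(omega)
        q = [s0 * a + s1 * b for a, b in zip(q0, q1)]
    n = math.sqrt(sum(c * c for c in q))
    return [c / n for c in q]

def interpolate_pose(t_event, t0, pose0, t1, pose1):
    """Interpolate a (translation, quaternion) pose at an event timestamp in [t0, t1]."""
    a = (t_event - t0) / (t1 - t0)
    trans = [p + a * (q - p) for p, q in zip(pose0[0], pose1[0])]
    return trans, slerp(pose0[1], pose1[1], a)

# Hypothetical poses from two RGB-D frames 1/30 s apart;
# the second pose is rotated by 90 degrees about the z axis.
pose_a = ([0.0, 0.0, 1.0], [1.0, 0.0, 0.0, 0.0])
pose_b = ([0.2, 0.0, 1.0], [math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4)])
trans, quat = interpolate_pose(1.0 / 60, 0.0, pose_a, 1.0 / 30, pose_b)
```

Halfway between the two frames, the interpolated pose sits midway in translation and is rotated by 45 degrees about z. Transferring it to the event camera frame would additionally require the extrinsic calibration, which is not modelled here.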

# 量子擬似ランダムスクランブラ

Quantum Pseudorandom Scramblers ( http://arxiv.org/abs/2309.08941v2 )

ライセンス: Link先を確認

Chuhan Lu, Minglong Qin, Fang Song, Penghui Yao, Mingnan Zhao,

(参考訳) 量子擬似ランダム状態発生器(PRSG)は近年、エキサイティングな発展を促している。固定初期(例えば全ゼロ)状態のPRSGは、Haarランダム状態と計算的に区別できない出力状態を生成する。しかし、出力状態の擬似ランダム性は他の初期状態では保証されない。実際、既知のPRSG構造はいくつかの初期状態で確実に失敗する。本研究では、任意の初期状態上で擬似ランダム状態を生成できる量子擬似ランダム状態スクランブラ(PRSS)を提案し、構築する。情報理論的な設定では、任意の初期状態を全変動距離におけるハールランダムに近い量子状態の分布にマッピングするスクランブラを得る。その結果,スクランブラは分散特性を示す。大まかに言えば、状態空間の$\epsilon$-netにまたがることができる。これは標準PRSGが誘導できるものを大幅に強化する。標準PRSGは、平均出力状態がハールランダム状態を近似しさえすれば、状態空間の小さな領域のみに集中しうるためである。我々のPRSS構造は有名なKacの歩行を並列に拡張したものであり、標準のKacの歩行よりも指数関数的に高速に混合することを示す。これは我々の証明の核となる。 PRSSの応用についても述べる。 我々のPRSSの構成は耐量子一方向性関数を仮定するが、PRSSは潜在的により弱いプリミティブであり、標準PRSGと同様に相対化された世界において一方向性関数から分離することができる。

Quantum pseudorandom state generators (PRSGs) have stimulated exciting developments in recent years. A PRSG, on a fixed initial (e.g., all-zero) state, produces an output state that is computationally indistinguishable from a Haar random state. However, pseudorandomness of the output state is not guaranteed on other initial states. In fact, known PRSG constructions provably fail on some initial states. In this work, we propose and construct quantum Pseudorandom State Scramblers (PRSSs), which can produce a pseudorandom state on an arbitrary initial state. In the information-theoretical setting, we obtain a scrambler which maps an arbitrary initial state to a distribution of quantum states that is close to Haar random in total variation distance. As a result, our scrambler exhibits a dispersing property. Loosely, it can span an $\epsilon$-net of the state space. This significantly strengthens what standard PRSGs can induce, as they may only concentrate on a small region of the state space provided that the average output state approximates a Haar random state. Our PRSS construction develops a parallel extension of the famous Kac's walk, and we show that it mixes exponentially faster than the standard Kac's walk. This constitutes the core of our proof. We also describe a few applications of PRSSs. While our PRSS construction assumes a post-quantum one-way function, PRSSs are potentially a weaker primitive and can be separated from one-way functions in a relativized world similar to standard PRSGs.

公開日:2024-09-22
翻訳日:2024-11-09 14:28:50
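
For readers unfamiliar with the standard Kac's walk that the abstract above builds on, a minimal classical sketch follows. It is an assumption-laden illustration on a real unit vector: each step picks two coordinates at random and rotates them by a random angle. It is not the authors' parallel extension or their PRSS construction, and the names `kac_walk_step` and `kac_walk` and the seed are hypothetical.

```python
import math
import random

def kac_walk_step(v, rng):
    """One step of the standard Kac's walk: rotate a random coordinate pair by a random angle."""
    i, j = rng.sample(range(len(v)), 2)
    theta = rng.uniform(0.0, 2.0 * math.pi)
    c, s = math.cos(theta), math.sin(theta)
    out = list(v)
    out[i] = c * v[i] - s * v[j]
    out[j] = s * v[i] + c * v[j]
    return out

def kac_walk(v, steps, seed=0):
    """Run the walk for a fixed number of steps from the starting vector v."""
    rng = random.Random(seed)
    for _ in range(steps):
        v = kac_walk_step(v, rng)
    return v

start = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # analogue of a fixed initial state
end = kac_walk(start, steps=200)
norm = math.sqrt(sum(x * x for x in end))
```

Each step is a planar rotation, so the vector stays on the unit sphere while its mass spreads over all coordinates. How fast that spreading happens is the mixing question the abstract refers to.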

# OWL:IT運用のための大規模言語モデル

OWL: A Large Language Model for IT Operations ( http://arxiv.org/abs/2309.09298v2 )

ライセンス: Link先を確認

Hongcheng Guo, Jian Yang, Jiaheng Liu, Liqun Yang, Linzheng Chai, Jiaqi Bai, Junran Peng, Xiaorong Hu, Chao Chen, Dongfeng Zhang, Xu Shi, Tieqiao Zheng, Liangfan Zheng, Bo Zhang, Ke Xu, Zhoujun Li,

(参考訳) IT運用の急速な発展に伴い、実用的なアプリケーションのために大量のデータを効率的に管理し、分析することがますます重要になっている。自然言語処理(NLP)の技術は、名前付きエンティティ認識、機械翻訳、対話システムなど、様々なタスクに顕著な能力を示している。最近、Large Language Models (LLM) は様々なNLPダウンストリームタスクで大幅に改善されている。しかし、IT運用には特殊なLLMが欠如している。本稿では,幅広いIT関連情報を含む,我々が収集したOWL-Instructデータセットに基づいて学習した大規模言語モデルOWLを紹介する。ここでは,異なるドメインやタスクにわたるパラメータ効率の良いチューニングを改善するために,mixture-of-adapter戦略を提案する。さらに、我々が構築したOWL-Benchおよび公開されているIT関連ベンチマーク上でOWLの性能を評価する。 OWLはITタスクにおいて優れた性能を示し、既存のモデルをかなり上回っている。さらに、私たちの研究の成果が、専門的なLLMでIT運用の技術に革命をもたらすための洞察を提供することを願っています。

With the rapid development of IT operations, it has become increasingly crucial to efficiently manage and analyze large volumes of data for practical applications. The techniques of Natural Language Processing (NLP) have shown remarkable capabilities for various tasks, including named entity recognition, machine translation and dialogue systems. Recently, Large Language Models (LLMs) have achieved significant improvements across various NLP downstream tasks. However, there is a lack of specialized LLMs for IT operations. In this paper, we introduce OWL, a large language model trained on our collected OWL-Instruct dataset with a wide range of IT-related information, where the mixture-of-adapter strategy is proposed to improve the parameter-efficient tuning across different domains or tasks. Furthermore, we evaluate the performance of OWL on the OWL-Bench established by us and on open IT-related benchmarks. OWL demonstrates superior performance on IT tasks, outperforming existing models by significant margins. Moreover, we hope that the findings of our work will provide more insights to revolutionize the techniques of IT operations with specialized LLMs.

公開日:2024-09-27
翻訳日:2024-11-09 14:28:50

# 古典的あるいは量子バイナリ最適化を用いた任意の線形方程式系を解く反復アルゴリズムの収束性の改善

Improving the convergence of an iterative algorithm for solving arbitrary linear equation systems using classical or quantum binary optimization ( http://arxiv.org/abs/2309.09933v3 )

ライセンス: Link先を確認

Erick R. Castro, Eldues O. Martins, Roberto S. Sarthour, Alexandre M. Souza, Ivan S. Oliveira,

(参考訳) 量子コンピューティングと量子に触発されたアルゴリズムの最近の進歩は、バイナリ最適化に新たな関心を喚起している。これらのハードウェアとソフトウェアの革新は、複雑な問題の求解時間に革命をもたらすことを約束する。本研究では,線形システムの新しい解法を提案する。提案手法はバイナリ最適化を利用しており,特に条件数の大きい問題に適している。線形系をバイナリ最適化問題に変換し、元の問題の幾何学からインスピレーションを得て、共役勾配法に類似した手法とする。このアプローチでは、アルゴリズムの収束率を著しく加速する共役方向を用いる。さらに本研究では,問題の内在的幾何の部分的知識を活用することにより,元の問題をより小さく独立した部分問題に分解できることを実証する。これらの部分問題は量子または古典的な解法を用いて効率的に取り組める。問題の幾何を決定することは追加の計算コストをもたらすが、この追加コストは、既存手法に対する大幅な性能向上によって十分に相殺される。

Recent advancements in quantum computing and quantum-inspired algorithms have sparked renewed interest in binary optimization. These hardware and software innovations promise to revolutionize solution times for complex problems. In this work, we propose a novel method for solving linear systems. Our approach leverages binary optimization, making it particularly well-suited for problems with large condition numbers. We transform the linear system into a binary optimization problem, drawing inspiration from the geometry of the original problem and resembling the conjugate gradient method. This approach employs conjugate directions that significantly accelerate the algorithm's convergence rate. Furthermore, we demonstrate that by leveraging partial knowledge of the problem's intrinsic geometry, we can decompose the original problem into smaller, independent sub-problems. These sub-problems can be efficiently tackled using either quantum or classical solvers. While determining the problem's geometry introduces some additional computational cost, this investment is outweighed by the substantial performance gains compared to existing methods.

公開日:2024-09-27
翻訳日:2024-11-09 14:28:50
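
The idea in the abstract above of recasting a linear system as a binary optimization problem can be illustrated with a deliberately naive sketch. It makes several assumptions: one bit per unknown shifts that coordinate by plus or minus a quarter of the current box width, a brute-force search over bit strings stands in for a quantum or classical binary solver, and the box is halved every iteration. The function names and the 2x2 system are hypothetical. The matrix is chosen with orthogonal columns, because this naive scheme is only guaranteed to behave well in that case; the conjugate directions mentioned in the abstract are not implemented here.

```python
from itertools import product

def residual(A, x, b):
    """Squared residual ||Ax - b||^2 for small dense matrices stored as lists."""
    return sum((sum(a * xi for a, xi in zip(row, x)) - bi) ** 2
               for row, bi in zip(A, b))

def refine(A, b, center, half_width, iterations):
    """Iteratively shrink a search box around the solution, one bit per coordinate."""
    for _ in range(iterations):
        def candidate(q):
            # Bit q_i = 0 moves coordinate i by -half_width/2, q_i = 1 by +half_width/2.
            return [c + half_width * (qi - 0.5) for c, qi in zip(center, q)]
        # The residual is quadratic in the bits, so this is a small QUBO-type problem;
        # exhaustive search replaces the binary solver in this sketch.
        best_q = min(product((0, 1), repeat=len(center)),
                     key=lambda q: residual(A, candidate(q), b))
        center = candidate(best_q)
        half_width /= 2.0
    return center

A = [[2.0, 1.0], [-1.0, 2.0]]  # orthogonal columns, so the naive scheme is well behaved
b = [0.0, -5.0]                # chosen so that the exact solution is x = (1, -2)
x = refine(A, b, center=[0.0, 0.0], half_width=4.0, iterations=30)
```

For matrices with strongly non-orthogonal columns, the best bit string at one scale can push the true solution out of the next box. That failure mode is one plausible reading of why the abstract stresses the geometry of the problem.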

# XY相互作用によるエネルギー保存量子回路の合成

Synthesis of Energy-Conserving Quantum Circuits with XY interaction ( http://arxiv.org/abs/2309.11051v3 )

ライセンス: Link先を確認

Ge Bai, Iman Marvian,

(参考訳) 我々は、$\sqrt{iSWAP}$ゲート、より一般的にはXX+YY相互作用だけで実現できるエンタングリングゲートから構築された量子回路について研究する。このようなゲートは計算基底における状態のハミング重みを保存する。これは、z軸周りの回転に対応する大域的U(1)対称性を尊重することを意味する。同値な言い方をすれば、系内の各量子ビットの内在的ハミルトニアンがパウリZ作用素であると仮定すると、系全体のエネルギーが保存される。我々は,z軸まわりの単一量子ビット回転の有無にかかわらず,XX+YY相互作用を用いて所望のエネルギー保存ユニタリを実現する回路を効率的に合成する方法を開発した。興味深いことに、CCZやFredkinゲートのような一般的なエネルギー保存ユニタリを2局所のエネルギー保存ゲートで実装するには、アンシラ量子ビットを使用する必要がある。 z軸周りの単一量子ビット回転が許されるとき、我々のスキームは1つのアンシラ量子ビットしか必要としないが、XX+YY相互作用だけでは2つのアンシラ量子ビットを必要とする。厳密な実現に加えて、近似的な実現についても検討し、$\sqrt{iSWAP}$ゲートの列と2個の補助量子ビットのみを用いて、一般のエネルギー保存ユニタリを任意に小さい誤差で合成できることを示す。この誤差はソロヴェイ・キタエフの定理によって評価できる。我々の方法は、XX+YY相互作用の代わりに、ハイゼンベルク交換相互作用のような計算基底で対角的でない他のエネルギー保存2体相互作用にアクセスできる場合にも、エネルギー保存ユニタリの合成に適用できる。量子コンピューティング、量子熱力学、量子時計の文脈におけるこれらの回路の応用について簡単に論じる。

We study quantum circuits constructed from $\sqrt{iSWAP}$ gates and, more generally, from the entangling gates that can be realized with the XX+YY interaction alone. Such gates preserve the Hamming weight of states in the computational basis, which means they respect the global U(1) symmetry corresponding to rotations around the z axis. Equivalently, assuming that the intrinsic Hamiltonian of each qubit in the system is the Pauli Z operator, they conserve the total energy of the system. We develop efficient methods for synthesizing circuits realizing any desired energy-conserving unitary using XX+YY interaction with or without single-qubit rotations around the z-axis. Interestingly, implementing generic energy-conserving unitaries, such as CCZ and Fredkin gates, with 2-local energy-conserving gates requires the use of ancilla qubits. When single-qubit rotations around the z-axis are permitted, our scheme requires only a single ancilla qubit, whereas with the XX+YY interaction alone, it requires 2 ancilla qubits. In addition to exact realizations, we also consider approximate realizations and show how a general energy-conserving unitary can be synthesized using only a sequence of $\sqrt{iSWAP}$ gates and 2 ancillary qubits, with arbitrarily small error, which can be bounded via the Solovay-Kitaev theorem. Our methods are also applicable for synthesizing energy-conserving unitaries when, rather than the XX+YY interaction, one has access to any other energy-conserving 2-body interaction that is not diagonal in the computational basis, such as the Heisenberg exchange interaction. We briefly discuss the applications of these circuits in the context of quantum computing, quantum thermodynamics, and quantum clocks.

公開日:2024-09-23
翻訳日:2024-11-09 14:28:50
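
The claim in the abstract above that $\sqrt{iSWAP}$ preserves the Hamming weight of computational-basis states can be checked numerically on two qubits. The sketch below uses the usual matrix form of the gate in the basis |00>, |01>, |10>, |11>. The helper names and the test state are assumptions made for the example, and nothing here reproduces the paper's synthesis methods.

```python
import math

s = 1.0 / math.sqrt(2.0)
# sqrt(iSWAP) in the basis |00>, |01>, |10>, |11>
SQRT_ISWAP = [
    [1, 0, 0, 0],
    [0, s, 1j * s, 0],
    [0, 1j * s, s, 0],
    [0, 0, 0, 1],
]
HAMMING_WEIGHT = [0, 1, 1, 2]  # number of 1s in each basis label

def apply(gate, state):
    """Multiply a 4x4 gate onto a two-qubit state vector."""
    return [sum(gate[r][c] * state[c] for c in range(4)) for r in range(4)]

def weight_distribution(state):
    """Probability of finding Hamming weight 0, 1 or 2."""
    probs = [0.0, 0.0, 0.0]
    for amp, w in zip(state, HAMMING_WEIGHT):
        probs[w] += abs(amp) ** 2
    return probs

state = [0.5, 0.5, 0.5j, -0.5]  # arbitrary normalized two-qubit state
before = weight_distribution(state)
after = weight_distribution(apply(SQRT_ISWAP, state))
```

The gate only mixes |01> and |10>, the two states of weight one, so the probability of each Hamming weight is unchanged even though individual amplitudes move.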

# EPTQ: Hessian-Guided Network-wise Optimization による学習後量子化の強化

EPTQ: Enhanced Post-Training Quantization via Hessian-guided Network-wise Optimization ( http://arxiv.org/abs/2309.11531v2 )

ライセンス: Link先を確認

Ofir Gordon, Elad Cohen, Hai Victor Habi, Arnon Netzer,

(参考訳) 量子化は、メモリと計算リソースが限られているエッジデバイスにディープニューラルネットワークをデプロイするための重要な方法である。ポストトレーニング量子化法(PTQ)の最近の改良は、重み量子化の丸めポリシーを学習するための追加の局所最適化プロセスによって達成された。しかし、小さな代表データセットでネットワークワイズ最適化を採用する場合、ギャップが存在する。本稿では,最適化中に層間の依存関係を考慮できるネットワークワイズ量子化最適化プロセスを利用する,拡張PTQ(Enhanced PTQ, EPTQ)の新たな手法を提案する。 EPTQは,ラベルフリーなヘッセ行列上界に基づく新しいサンプル層アテンションスコアを用いて,小さな代表データセットによるネットワークワイズ最適化を実現する。ラベルフリーなアプローチにより,本手法はPTQ方式に適合する。この上界について理論的解析を行い、それを用いて、より敏感な層やサンプルに最適化を集中させる知識蒸留損失を構築する。さらに,重みテンソルのより敏感な要素に着目し,重み量子化パラメータの選択を改善するためにヘッセ上界を利用する。 実験的には,EPTQを用いることで、ImageNet分類、COCOオブジェクト検出、意味的セグメンテーションのためのPascal-VOCなど、さまざまなモデル、タスク、データセットで最先端の結果が得られる。

Quantization is a key method for deploying deep neural networks on edge devices with limited memory and computation resources. Recent improvements in Post-Training Quantization (PTQ) methods were achieved by an additional local optimization process for learning the weight quantization rounding policy. However, a gap exists when employing network-wise optimization with small representative datasets. In this paper, we propose a new method for enhanced PTQ (EPTQ) that employs a network-wise quantization optimization process, which benefits from considering cross-layer dependencies during optimization. EPTQ enables network-wise optimization with a small representative dataset using a novel sample-layer attention score based on a label-free Hessian matrix upper bound. The label-free approach makes our method suitable for the PTQ scheme. We give a theoretical analysis for the said bound and use it to construct a knowledge distillation loss that guides the optimization to focus on the more sensitive layers and samples. In addition, we leverage the Hessian upper bound to improve the weight quantization parameters selection by focusing on the more sensitive elements in the weight tensors. Empirically, by employing EPTQ we achieve state-of-the-art results on various models, tasks, and datasets, including ImageNet classification, COCO object detection, and Pascal-VOC for semantic segmentation.

公開日:2024-09-26
翻訳日:2024-11-09 14:28:50
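
To make the phrase "weight quantization rounding policy" in the abstract above concrete, the sketch below contrasts plain round-to-nearest with a per-weight rounding decision. It is a generic illustration under assumptions: uniform symmetric 4-bit quantization, a hand-picked `round_up` mask standing in for a policy that an optimizer would learn, and hypothetical weights. It does not implement EPTQ's Hessian-guided loss.

```python
import math

def quantize(weights, scale, n_bits=4, round_up=None):
    """Uniform symmetric quantization; round_up[i] overrides round-to-nearest when given."""
    qmin, qmax = -(2 ** (n_bits - 1)), 2 ** (n_bits - 1) - 1
    out = []
    for i, w in enumerate(weights):
        if round_up is None:
            q = math.floor(w / scale + 0.5)  # round to nearest grid point
        else:
            q = math.floor(w / scale) + (1 if round_up[i] else 0)  # per-weight choice
        out.append(scale * max(qmin, min(qmax, q)))  # clip to the integer range, then rescale
    return out

weights = [0.26, -0.74, 0.49]
nearest = quantize(weights, scale=0.5)
learned = quantize(weights, scale=0.5, round_up=[False, True, True])
```

Here the first weight lands on a different grid point under the two policies. Methods that learn the rounding choose such up or down decisions to minimize a task-aware loss rather than the per-weight rounding error.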

# 局所周期駆動を用いた光学格子の個別可変トンネル係数

Individually tunable tunnelling coefficients in optical lattices using local periodic driving ( http://arxiv.org/abs/2309.12124v2 )

ライセンス: Link先を確認

Georgia M. Nixon, F. Nur Unal, Ulrich Schneider,

(参考訳) 光格子中の超低温原子は、並進不変系の強力な量子シミュレータとして登場し、例えば強相関系やトポロジカル系に多くの応用がある。しかしながら、すべてのハミルトニアンパラメータを局所的にチューニングする能力は、より広い範囲の量子現象のシミュレーションを可能にする未達成の目標のままである。量子ガス顕微鏡と光ピンセットの最近の進歩に動機づけられ、局所的な時間周期ポテンシャルを組み込むことで、光格子内の個々のトンネルリンクに対する局所的な制御がどのように達成できるかを理論的に示す。本研究では,個々の格子サイトのオンサイトエネルギーを周期的に変調することを提案し,Floquet理論を用いて,これにより1次元のトンネル振幅を個別に完全制御できることを示す。他の手段では実現が困難な、拡張Su-Schrieffer-Heegerモデルなどの興味深いトポロジカルモデルを実現する様々な構成例を提供する。 2次元に拡張すると、リーブ格子における局所周期駆動により、トンネルの大きさを完全に制御可能な2次元ネットワークを設計できることを示す。 3サイトのプラケットでは, 相対的なトンネル振幅とプラケットを貫くゲージ不変フラックスを同時に完全制御できることを示し, 完全にプログラム可能な2次元強結合モデルを構築するための明確な足がかりを提供する。また、2次元の磁場勾配を生成するために、我々の技術をどのように活用するかを明確に示す。この局所変調スキームは多くの異なる格子幾何学に適用できる。

Ultracold atoms in optical lattices have emerged as powerful quantum simulators of translationally invariant systems with many applications in e.g. strongly-correlated and topological systems. However, the ability to locally tune all Hamiltonian parameters remains an outstanding goal that would enable the simulation of a wider range of quantum phenomena. Motivated by recent advances in quantum gas microscopes and optical tweezers, we here show theoretically how local control over individual tunnelling links in an optical lattice can be achieved by incorporating local time-periodic potentials. We propose to periodically modulate the on-site energy of individual lattice sites and employ Floquet theory to demonstrate how this provides full individual control over the tunnelling amplitudes in one dimension. We provide various example configurations realising interesting topological models such as extended Su-Schrieffer-Heeger models that would be challenging to realise by other means. Extending to two dimensions, we demonstrate that local periodic driving in a Lieb lattice engineers a 2D network with fully controllable tunnelling magnitudes. In a three-site plaquette, we show full simultaneous control over the relative tunnelling amplitudes and the gauge-invariant flux piercing the plaquette, providing a clear stepping stone to building a fully programmable 2D tight-binding model. We also explicitly demonstrate how to utilise our technique to generate a magnetic field gradient in 2D. This local modulation scheme is applicable to many different lattice geometries.

公開日:2024-09-24
翻訳日:2024-11-09 14:28:50
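
As background for the Floquet argument in the abstract above, the textbook high-frequency result for a sinusoidally modulated energy offset between two neighbouring sites is that the bare tunnelling J is rescaled by a Bessel function, J_eff = J * J0(A / omega), with hbar = 1. The sketch below evaluates that estimate. It is an assumption that this standard formula is the relevant starting point, and the paper's own expressions for locally driven sites may differ. The function names and the numerical values are hypothetical.

```python
import math

def bessel_j0(x, n=2000):
    """J0(x) = (1/pi) * integral over [0, pi] of cos(x sin(theta)), by the midpoint rule."""
    h = math.pi / n
    return sum(math.cos(x * math.sin((k + 0.5) * h)) for k in range(n)) * h / math.pi

def effective_tunnelling(bare_j, drive_amplitude, drive_frequency):
    """High-frequency estimate J_eff = J * J0(A / omega), in units where hbar = 1."""
    return bare_j * bessel_j0(drive_amplitude / drive_frequency)

undriven = effective_tunnelling(1.0, 0.0, 10.0)         # no modulation: bare tunnelling
suppressed = effective_tunnelling(1.0, 24.04826, 10.0)  # A / omega near the first zero of J0
```

Tuning the ratio A / omega between zero and the first zero of J0 (about 2.405) sweeps the effective tunnelling on that link from its bare value down to zero, which is the sense in which a local drive gives a local tuning knob.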

# ウィグナーの友人シナリオと非古典的因果適合性, モノガミー関係, 微調整との関係

Relating Wigner's Friend scenarios to Nonclassical Causal Compatibility, Monogamy Relations, and Fine Tuning ( http://arxiv.org/abs/2309.12987v3 )

ライセンス: Link先を確認

Yìlè Yīng, Marina Maciel Ansanelli, Andrea Di Biagio, Elie Wolfe, Eric Gama Cavalcanti,

(参考訳) 非古典的因果モデリングは、相対論的因果構造と忠実性を守りつつ、すなわち微調整された因果的説明を避けつつ、ベルの不等式の破れを説明するために開発された。近年、ウィグナーの友人の思考実験の拡張に基づいて、ベルの定理より強いと見なせるノーゴー定理、すなわちLocal Friendliness(LF)ノーゴー定理が導出された。ここでは、LFノーゴー定理は、非古典的あるいは循環的な因果的説明を考慮したとしても、因果モデリングの分野に重大な課題をもたらすことを示す。我々はまず、LFノーゴー定理の重要な要素の一つであるLF不等式を、統計的周辺問題から生じるモノガミー関係の特別な場合として捉え直す。次に、LF不等式を、十分に動機付けられた因果的・形而上学的仮定から導かれる因果構造に対して、非古典的因果周辺問題から生じる因果適合性不等式として捉え直す。この因果構造からLF不等式が現れるのは、一般化確率論やさらにエキゾチックな理論のように、観測された事象の潜在的原因がポスト量子的な記述を許す場合であっても同様である。さらに、いかなる非古典的因果モデルも、No Fine-Tuning原則に違反することなくLF不等式の破れを説明できないことを証明する。最後に、循環因果モデルに訴えてもこれらの障害は克服できないことに留意し、因果モデリングフレームワークのさらなる拡張の可能性について論じる。

Nonclassical causal modeling was developed in order to explain violations of Bell inequalities while adhering to relativistic causal structure and faithfulness -- that is, avoiding fine-tuned causal explanations. Recently, a no-go theorem that can be viewed as being stronger than Bell's theorem has been derived, based on extensions of the Wigner's friend thought experiment: the Local Friendliness (LF) no-go theorem. Here we show that the LF no-go theorem poses formidable challenges for the field of causal modeling, even when nonclassical and/or cyclic causal explanations are considered. We first recast the LF inequalities, one of the key elements of the LF no-go theorem, as special cases of monogamy relations stemming from a statistical marginal problem. We then further recast LF inequalities as causal compatibility inequalities stemming from a nonclassical causal marginal problem, for a causal structure implied by well-motivated causal-metaphysical assumptions. We find that the LF inequalities emerge from this causal structure even when one allows the latent causes of observed events to admit post-quantum descriptions, such as in a generalized probabilistic theory or in an even more exotic theory. We further prove that no nonclassical causal model can explain violations of LF inequalities without violating the No Fine-Tuning principle. Finally, we note that these obstacles cannot be overcome even if one appeals to cyclic causal models, and we discuss potential directions for further extensions of the causal modeling framework.

公開日:2024-09-25
翻訳日:2024-11-09 14:28:50

# ウィグナーの友人シナリオと非古典的因果適合性, モノガミー関係, 微調整との関連性

Relating Wigner's Friend Scenarios to Nonclassical Causal Compatibility, Monogamy Relations, and Fine Tuning ( http://arxiv.org/abs/2309.12987v4 )

ライセンス: Link先を確認

Yìlè Yīng, Marina Maciel Ansanelli, Andrea Di Biagio, Elie Wolfe, David Schmid, Eric Gama Cavalcanti,

公開日:2024-09-25
翻訳日:2024-11-09 14:28:50

# アルゴリズム採用における公正性とバイアス--多分野調査

Fairness and Bias in Algorithmic Hiring: a Multidisciplinary Survey ( http://arxiv.org/abs/2309.13933v3 )

ライセンス: Link先を確認

Alessandro Fabris, Nina Baranowska, Matthew J. Dennis, David Graus, Philipp Hacker, Jorge Saldivar, Frederik Zuiderveen Borgesius, Asia J. Biega,

(参考訳) 雇用者は採用パイプライン全体を通してアルゴリズムによる雇用技術を採用しています。アルゴリズム的公正性は、高い利害関係と構造的不平等のため、この領域で特に重要である。残念ながら、この分野のほとんどの研究は部分的な扱いしか提供しておらず、偏った採用担当者の判断を置き換えることに楽観的に注目するか、差別の自動化を悲観的に指摘するかという、2つの競合する物語によってしばしば制約されている。アルゴリズムによる雇用が、ローテクな代替手段よりもバイアスが少なく社会に有益でありうるのか、そしてより重要なことに、どのような種類のものであればそうなるのかは、現在も未解決のままであり、信頼性を損なっている。この多分野にわたる調査は、システム、バイアス、尺度、緩和戦略、データセット、およびアルゴリズム雇用と公正性の法的側面をバランスよく統合的に扱い、実践者と研究者を対象としている。私たちの仕事は、現在の機会と制限を強調し、すべての利害関係者が利益を共有できるように将来の作業に対する推奨を提供することによって、この技術の文脈に即した理解とガバナンスを支援します。

Employers are adopting algorithmic hiring technology throughout the recruitment pipeline. Algorithmic fairness is especially applicable in this domain due to its high stakes and structural inequalities. Unfortunately, most work in this space provides partial treatment, often constrained by two competing narratives, optimistically focused on replacing biased recruiter decisions or pessimistically pointing to the automation of discrimination. Whether, and more importantly what types of, algorithmic hiring can be less biased and more beneficial to society than low-tech alternatives currently remains unanswered, to the detriment of trustworthiness. This multidisciplinary survey caters to practitioners and researchers with a balanced and integrated coverage of systems, biases, measures, mitigation strategies, datasets, and legal aspects of algorithmic hiring and fairness. Our work supports a contextualized understanding and governance of this technology by highlighting current opportunities and limitations, providing recommendations for future work to ensure shared benefits for all stakeholders.

公開日:2024-09-24
翻訳日:2024-11-09 14:28:50

# 不可逆性としての誤差と外乱:統一定義、ウィグナー・アラキ・ヤナーゼ定理および時間外相関器

Error and Disturbance as Irreversibility with Applications: Unified Definition, Wigner--Araki--Yanase Theorem and Out-of-Time-Order Correlator ( http://arxiv.org/abs/2309.14172v2 )

ライセンス: Link先を確認

Haruki Emori, Hiroyasu Tajima,

(参考訳) ハイゼンベルクの不確定性原理の提案以来、量子測定の誤差と外乱は量子物理学の基本的な概念となっている。量子物理学において物理量を定義する際によくあるように、これらの2つの概念を定義する唯一の方法はなく、多くの独立した定義が与えられている。ここでは、量子過程における不可逆性の特別な場合として、誤差と外乱を定義する新しい定式化を確立する。この定式化により、確率的熱力学と量子情報理論における不可逆性の知識を量子測定の誤差と外乱に適用することができる。この強みを示すために、我々は3つの副産物を提供する: まず、既存の誤差と外乱の定式化を統一する。第二に、定量的ウィグナー・アラキ・ヤナーゼ定理(保存則の下での測定の実装に関する普遍的な制限)を、任意の定義および過程の誤差と外乱に拡張する。第三に、我々の定式化は、量子多体系における量子カオスの尺度である時間外順序相関子(out-of-time-ordered correlator)を、測定の文脈との類推で不可逆性としてカバーすることを明らかにし、その実験的評価方法を提供する。

Since the proposal of Heisenberg's uncertainty principle, error and disturbance of quantum measurements have been fundamental notions in quantum physics. As is often the case when defining physical quantities in quantum physics, there is no single way to define these two notions, and many independent definitions of them have been given. Here, we establish a novel formulation defining the error and disturbance as special cases of the irreversibility in quantum processes. The formulation enables us to apply the knowledge of irreversibility in stochastic thermodynamics and quantum information theory to the error and disturbance in quantum measurements. To demonstrate this strength, we provide three byproducts: First, we unify the existing formulations of error and disturbance. Second, we extend the quantitative Wigner--Araki--Yanase theorem -- a universal restriction on measurement implementation under a conservation law -- to errors and disturbances of arbitrary definitions and processes. Third, we reveal that our formulation covers the out-of-time-ordered correlator -- a measure of quantum chaos in a quantum many-body system -- as the irreversibility in analogy with the measurement context, and provide its experimental evaluation method.

公開日:2024-09-27
翻訳日:2024-11-09 14:28:50

# Informative Manifold Projection を用いたクラスタ探索

Cluster Exploration using Informative Manifold Projections ( http://arxiv.org/abs/2309.14857v3 )

ライセンス: Link先を確認

Stavros Gerolymatos, Xenophon Evangelopoulos, Vladimir Gusev, John Y. Goulermas,

(参考訳) 次元削減(DR)は、高次元データの視覚的な探索と、2次元または3次元空間におけるクラスタ構造を明らかにするための重要なツールの1つである。文献におけるDR手法の大部分は、実践者が検討中のデータセットに関して持ちうる事前知識を考慮に入れていない。本稿では,さまざまな種類の事前知識に関連する構造を取り除くだけでなく,残された基礎構造を明らかにすることを目的とした,情報量の多い埋め込みを生成する新しい手法を提案する。これを実現するために,事前情報に関連付けられた構造を割り引くコントラストPCAと,得られた埋め込みにおいて有意なデータ分離を保証する尖度(kurtosis)射影追跡という2つの目的の線形結合を用いる。本稿では,この課題を多様体最適化問題として定式化し,3種類の事前知識を考慮して多様なデータセットで経験的に検証する。最後に,高次元データの反復的視覚探索を行うための自動化されたフレームワークを提供する。

Dimensionality reduction (DR) is one of the key tools for the visual exploration of high-dimensional data and uncovering its cluster structure in two- or three-dimensional spaces. The vast majority of DR methods in the literature do not take into account any prior knowledge a practitioner may have regarding the dataset under consideration. We propose a novel method to generate informative embeddings which not only factor out the structure associated with different kinds of prior knowledge but also aim to reveal any remaining underlying structure. To achieve this, we employ a linear combination of two objectives: firstly, contrastive PCA that discounts the structure associated with the prior information, and secondly, kurtosis projection pursuit which ensures meaningful data separation in the obtained embeddings. We formulate this task as a manifold optimization problem and validate it empirically across a variety of datasets considering three distinct types of prior knowledge. Lastly, we provide an automated framework to perform iterative visual exploration of high-dimensional data.

公開日:2024-09-27
翻訳日:2024-11-09 14:28:50

# Can-SAVE:生存分析変数とEHRによる大量がんリスク予測

Can-SAVE: Mass Cancer Risk Prediction via Survival Analysis Variables and EHR ( http://arxiv.org/abs/2309.15039v2 )

ライセンス: Link先を確認

Petr Philonenko, Vladimir Kokh, Pavel Blinov,

(参考訳) 特定のがんスクリーニング法は、しばしば費用と時間がかかり、大規模には適用しにくい。高度な人工知能(AI)法は、がんの検出に大いに役立つが、特殊な、あるいは詳細な医療データを必要とする。これらの側面は、がんスクリーニング法の大規模な導入を妨げる。そのため、既存のElectronic Health Records(EHR)ボリュームに基づいて、患者のがんリスクの大規模な個別評価にAI手法を適用することは、医療にとって破壊的な変化である。本稿では,生存分析アプローチと勾配ブースティングアルゴリズムを組み合わせた新しいCan-SAVE癌リスク評価手法を提案する。アクセス性が高く、資源効率が良く、一連の高レベルの医療イベントのみを利用する。提案手法を、110万人以上の住民とロシアの4つの地域を対象とした長期的な後ろ向き実験で検証した。 Can-SAVE法は、Average Precision指標において22.8%$\pm$2.7%対15.1%$\pm$2.6%でベースラインを大きく上回る。広範囲にわたるアブレーション試験でも,提案手法の優位性が確認された。腫瘍学者の監督下で行われた実験では、選択された1000人中最大84人という信頼性の高いがん患者検出率が示された。この結果は医療スクリーニング戦略の推定値を上回る。典型的な年齢別のNumber Needed to Screenは1000人中9人にとどまる(大腸癌の場合)。全体として,従来の医療リスク評価手法に比べて癌検出率(TOP@1k)は4.7-6.4倍向上した。

Specific medical cancer screening methods are often costly, time-consuming, and weakly applicable on a large scale. Advanced Artificial Intelligence (AI) methods greatly help cancer detection but require specific or deep medical data. These aspects prevent the mass implementation of cancer screening methods. For this reason, it is a disruptive change for healthcare to apply AI methods for mass personalized assessment of the cancer risk among patients based on the existing Electronic Health Records (EHR) volume. This paper presents a novel Can-SAVE cancer risk assessment method combining a survival analysis approach with a gradient-boosting algorithm. It is highly accessible and resource-efficient, utilizing only a sequence of high-level medical events. We tested the proposed method in a long-term retrospective experiment covering more than 1.1 million people and four regions of Russia. The Can-SAVE method significantly exceeds the baselines by the Average Precision metric of 22.8%$\pm$2.7% vs 15.1%$\pm$2.6%. The extensive ablation study also confirmed the proposed method's dominant performance. The experiment supervised by oncologists shows a reliable cancer patient detection rate of up to 84 out of 1000 selected. Such results surpass the medical screening strategies estimates; the typical age-specific Number Needed to Screen is only 9 out of 1000 (for colorectal cancer). Overall, our experiments show a 4.7-6.4 times improvement in cancer detection rate (TOP@1k) compared to the traditional healthcare risk estimation approach.

公開日:2024-09-27
翻訳日:2024-11-09 10:12:15

# 機械学習のためのハミングウェイト保存量子回路の訓練性と表現性

Trainability and Expressivity of Hamming-Weight Preserving Quantum Circuits for Machine Learning ( http://arxiv.org/abs/2309.15547v2 )

ライセンス: Link先を確認

Léo Monbroussou, Eliott Z. Mamon, Jonas Landman, Alex B. Grilo, Romain Kukla, Elham Kashefi,

(参考訳) 量子機械学習(QML)は、量子コンピュータの現実的な応用にとって有望な分野となっているが、短期的手法とその拡張性は依然として重要な研究トピックである。この文脈では、特定のハミング重み保存変分量子回路(VQC)の訓練可能性と制御可能性について分析する。これらの回路は、固定ハミング重み$k$の基底状態で張られるヒルベルト空間の部分空間を保存する量子ビットゲートを使用する。本研究では、まず、$n$量子ビットの量子回路を訓練することにより$\binom{n}{k}$次元ベクトルの量子振幅符号化を行う、新しいヒューリスティックなデータローダを設計し、その実現可能性を示す。これらのデータローダは、量子フィッシャー情報行列(QFIM)のランクを確認することにより、次元削減技術を用いて得られる。第2に、任意のVQC状態のQFIMのランクがほとんど至るところで一定であるという事実を理論的に正当化する。これはそれ自体興味深い結果である。最後に、ハミング重み保存回路の訓練可能性を分析し、$l_2$コスト関数の勾配の分散が部分空間の次元$\binom{n}{k}$に応じて有界であることを示す。これにより、これらの回路に対する不毛な台地(Barren Plateaus)の存在/不在の条件が証明され、変分量子回路の制御可能性と訓練可能性の関係に関する最近の予想が適用されない状況が浮き彫りになる。

Quantum machine learning (QML) has become a promising area for real world applications of quantum computers, but near-term methods and their scalability are still important research topics. In this context, we analyze the trainability and controllability of specific Hamming weight preserving variational quantum circuits (VQCs). These circuits use qubit gates that preserve subspaces of the Hilbert space, spanned by basis states with fixed Hamming weight $k$. In this work, we first design and prove the feasibility of new heuristic data loaders, performing quantum amplitude encoding of $\binom{n}{k}$-dimensional vectors by training an $n$-qubit quantum circuit. These data loaders are obtained using dimensionality reduction techniques, by checking the Quantum Fisher Information Matrix (QFIM)'s rank. Second, we provide a theoretical justification for the fact that the rank of the QFIM of any VQC state is almost-everywhere constant, which is of separate interest. Lastly, we analyze the trainability of Hamming weight preserving circuits, and show that the variance of the $l_2$ cost function gradient is bounded according to the dimension $\binom{n}{k}$ of the subspace. This proves conditions of existence/lack of Barren Plateaus for these circuits, and highlights a setting where a recent conjecture on the link between controllability and trainability of variational quantum circuits does not apply.

公開日:2024-09-26
翻訳日:2024-11-09 10:12:15

# ミューオン崩壊における相対論的絡み合い

Relativistic entanglement in muon decay ( http://arxiv.org/abs/2309.15863v2 )

ライセンス: Link先を確認

S. Carneiro, F. C. Sobrinho,

(参考訳) 非崩壊(non-collapsing)相互作用の存在下での量子もつれの時間発展について論じる。特に、磁場中におけるミューオン崩壊生成物間のもつれを再考する。これは角運動量保存の結果であり、測定されるミューオンg因子に、ブルックヘイブンとフェルミラボの実験によって報告されたものと正確に一致する異常をもたらす。

We discuss the time evolution of quantum entanglement in the presence of non-collapsing interactions. In particular, the entanglement between the products of a muon decay in a magnetic field is revisited. It results from angular momentum conservation and leads to an anomaly in the measured muon g factor in precise agreement with that reported by the Brookhaven and Fermilab experiments.

公開日:2024-09-26
翻訳日:2024-11-09 10:12:15

# 支援を受けるための学習: 介入を意識した概念埋め込みモデル

Learning to Receive Help: Intervention-Aware Concept Embedding Models ( http://arxiv.org/abs/2309.16928v3 )

ライセンス: Link先を確認

Mateo Espinosa Zarlenga, Katherine M. Collins, Krishnamurthy Dvijotham, Adrian Weller, Zohreh Shams, Mateja Jamnik,

(参考訳) 概念ボトルネックモデル (Concept Bottleneck Models, CBM) は、高レベルの概念セットを使用して予測を構築し、説明することによって、ニューラルネットワークの不透明さに対処する。これらのモデルの特別な特性は、ユーザーが誤予測された概念を修正でき、それによってモデルの性能が向上する、概念の介入を許すことである。しかし、最近の研究は、介入の効果が、概念に介入する順序や、モデルのアーキテクチャおよび訓練ハイパーパラメータに大きく依存しうることを示した。これは、モデルが概念介入を適切に受け入れるようにするための訓練時のインセンティブがCBMに欠けていることに起因している、と我々は主張する。そこで我々は、テスト時の介入に対するモデルの受容性を向上させる、新しいCBMベースのアーキテクチャおよびトレーニングパラダイムであるIntervention-aware Concept Embedding Models (IntCEMs)を提案する。我々のモデルは、概念介入ポリシーをエンド・ツー・エンドで学習し、そこから訓練時に意味のある介入軌跡をサンプリングできる。これにより、IntCEMはテスト時にデプロイされた際に、概念介入を効果的に選択し、受け取れるようになる。実験の結果、テスト時の概念介入が与えられた場合、IntCEMは最先端の概念解釈可能モデルを大幅に上回り、本手法の有効性が示された。

Concept Bottleneck Models (CBMs) tackle the opacity of neural architectures by constructing and explaining their predictions using a set of high-level concepts. A special property of these models is that they permit concept interventions, wherein users can correct mispredicted concepts and thus improve the model's performance. Recent work, however, has shown that intervention efficacy can be highly dependent on the order in which concepts are intervened on and on the model's architecture and training hyperparameters. We argue that this is rooted in a CBM's lack of train-time incentives for the model to be appropriately receptive to concept interventions. To address this, we propose Intervention-aware Concept Embedding models (IntCEMs), a novel CBM-based architecture and training paradigm that improves a model's receptiveness to test-time interventions. Our model learns a concept intervention policy in an end-to-end fashion, from which it can sample meaningful intervention trajectories at train-time. This conditions IntCEMs to effectively select and receive concept interventions when deployed at test-time. Our experiments show that IntCEMs significantly outperform state-of-the-art concept-interpretable models when provided with test-time concept interventions, demonstrating the effectiveness of our approach.

公開日:2024-09-26
翻訳日:2024-11-09 10:12:15
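
The abstract above describes concept interventions, in which a user overwrites mispredicted concepts before the label is predicted. The following is a minimal sketch of that mechanism only, assuming a plain linear label predictor on top of the concepts. The names `intervene` and `label_score`, the weights, and the concept values are all hypothetical, and IntCEM's learned intervention policy is not modelled here.

```python
def intervene(predicted, true, mask):
    """Replace predicted concept values with ground truth where mask is True."""
    return [t if m else p for p, t, m in zip(predicted, true, mask)]


def label_score(concepts, weights, bias=0.0):
    """Hypothetical linear label predictor that reads only the concept values."""
    return sum(c * w for c, w in zip(concepts, weights)) + bias


# Illustrative values only.
predicted = [0.9, 0.2, 0.4]   # concept probabilities from the concept encoder
true = [1.0, 1.0, 0.0]        # concept values supplied by the user
mask = [False, True, False]   # the user corrects only the second concept
weights = [1.0, 2.0, -1.0]

corrected = intervene(predicted, true, mask)
print(label_score(predicted, weights), label_score(corrected, weights))
```

Because the label depends on the input only through the concepts, correcting one concept changes the downstream score directly. The abstract argues that models should be trained to benefit from this dependence.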

# 評価器としての大規模言語モデルにおける認知バイアスのベンチマーク

Benchmarking Cognitive Biases in Large Language Models as Evaluators ( http://arxiv.org/abs/2309.17012v3 )

ライセンス: Link先を確認

Ryan Koo, Minhwa Lee, Vipul Raheja, Jong Inn Park, Zae Myung Kim, Dongyeop Kang,

(参考訳) 大規模言語モデルは認知的に偏った判定者である。大規模言語モデル(LLM)は、最近、簡単なプロンプトと文脈内学習によって自動評価器として有効であることが示されている。本研究では、4つの異なるサイズ範囲の15個のLLMを集め、他のLLMを評価器として、その出力応答を選好順位付け（例えば「System StarはSystem Squareより優れている」）によって評価する。次に、LLM評価出力における6つの異なる認知バイアス（例えば、モデルが評価において自身の出力を高く順位付けしたがる自己中心バイアス）を測定するベンチマークであるCognitive Bias Benchmark for LLMs as Evaluators (CoBBLEr)を導入し、順位付け出力の品質を評価する。その結果、LLMは偏りのあるテキスト品質評価器であり、各評価においてバイアスベンチマーク上で強い兆候（全モデルにわたる比較の平均40%）を示しており、評価器としての頑健性に疑問が生じることがわかった。さらに、人間と機械の選好の相関を調べ、平均Rank-Biased Overlap (RBO)スコアを49.6%と算出した。これは、機械の選好が人間の選好と一致していないことを示している。以上の結果から、LLMは人間の選好に沿った自動アノテーションにはまだ利用できない可能性がある。プロジェクトページ: https://minnesotanlp.github.io/cobbler

Large Language Models are cognitively biased judges. Large Language Models (LLMs) have recently been shown to be effective as automatic evaluators with simple prompting and in-context learning. In this work, we assemble 15 LLMs of four different size ranges and evaluate their output responses by preference ranking from the other LLMs as evaluators (e.g., "System Star is better than System Square"). We then evaluate the quality of the ranking outputs by introducing the Cognitive Bias Benchmark for LLMs as Evaluators (CoBBLEr), a benchmark to measure six different cognitive biases in LLM evaluation outputs, such as the Egocentric bias, where a model prefers to rank its own outputs highly in evaluation. We find that LLMs are biased text quality evaluators, exhibiting strong indications on our bias benchmark (average of 40% of comparisons across all models) within each of their evaluations that question their robustness as evaluators. Furthermore, we examine the correlation between human and machine preferences and calculate the average Rank-Biased Overlap (RBO) score to be 49.6%, indicating that machine preferences are misaligned with humans. According to our findings, LLMs may not yet be usable for automatic annotation aligned with human preferences. Our project page is at: https://minnesotanlp.github.io/cobbler.

公開日:2024-09-25
翻訳日:2024-11-09 10:12:15
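
The abstract above reports an average Rank-Biased Overlap (RBO) score between human and machine preference rankings. As a rough sketch of how such a score can be computed, the function below implements the extrapolated form of RBO for two equal-length rankings without repeated items. The function name `rbo_ext` and the persistence parameter `p=0.9` are assumptions for illustration, and the abstract does not state which variant or parameter the authors used.

```python
def rbo_ext(s, t, p=0.9):
    """Extrapolated Rank-Biased Overlap of two equal-length rankings.

    s, t: sequences of hashable items, each without duplicates.
    p:    persistence in (0, 1); larger values weight deeper ranks more.
    """
    if len(s) != len(t):
        raise ValueError("this sketch only handles rankings of equal length")
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    k = len(s)
    if k == 0:
        return 0.0
    seen_s, seen_t = set(), set()
    overlap = 0      # size of the intersection of the two prefixes at depth d
    weighted = 0.0   # running sum of (overlap / d) * p**d
    for d in range(1, k + 1):
        a, b = s[d - 1], t[d - 1]
        if a == b:
            overlap += 1
        else:
            if a in seen_t:
                overlap += 1
            if b in seen_s:
                overlap += 1
        seen_s.add(a)
        seen_t.add(b)
        weighted += (overlap / d) * p ** d
    return (overlap / k) * p ** k + ((1 - p) / p) * weighted


human = ["A", "B", "C", "D"]     # hypothetical human preference ranking
machine = ["B", "A", "C", "D"]   # hypothetical machine preference ranking
print(rbo_ext(human, machine))
```

Identical rankings score 1 and disjoint rankings score 0, so a value near 0.5 indicates only partial agreement, with disagreements near the top of the lists weighted most heavily.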

# 9歳の子どもは感情面でChatGPTを上回った: 中国語作文からの証拠

Nine-year-old children outperformed ChatGPT in emotion: Evidence from Chinese writing ( http://arxiv.org/abs/2310.00578v2 )

ライセンス: Link先を確認

Siyi Cao, Yizhong Xu, Tongquan Zhou, Siruo Zhou,

(参考訳) 近年の研究では、ChatGPTは複雑な人間のようなテキストを生成する能力を持つことが実証されており、心的タスクの理論におけるその性能は、9歳の子供に匹敵するものであることが確認されている。しかし、ChatGPTが中国語の筆記能力で9歳の子供を上回っているかどうかは不明である。そこで本研究では,ChatGPTと9歳児のナラティブと科学の両面から,ChatGPTの相対的な強みと弱さを明らかにすることを目的として,中国語の筆記能力について検討した。収集したデータは、流布度、精度、複雑さ、凝集度、感情の5つの言語次元で分析された。各次元は正確な指標によって評価された。以上の結果から,9歳児は書字の流布度や結束度においてChatGPT以上に優れていた。一方,ChatGPTは,子どもに比べて精度が優れていた。複雑性に関して、子どもたちは科学をテーマとした執筆において優れたスキルを示し、一方でChatGPTは自然をテーマとした執筆において優位に立った。この研究は、中国の作文において、9歳の子供がChatGPTよりも強い感情を伝えることを明らかにする先駆的な研究である。

ChatGPT has been demonstrated to possess significant capabilities in generating intricate, human-like text, and recent studies have established that its performance in theory of mind tasks is comparable to that of a nine-year-old child. However, it remains uncertain whether ChatGPT surpasses nine-year-old children in Chinese writing proficiency. To explore this, our study juxtaposed the Chinese writing performance of ChatGPT and nine-year-old children on both narrative and scientific topics, aiming to uncover the relative strengths and weaknesses of ChatGPT in writing. The collected data were analyzed across five linguistic dimensions: fluency, accuracy, complexity, cohesion, and emotion. Each dimension underwent assessment through precise indices. The findings revealed that nine-year-old children excelled beyond ChatGPT in terms of fluency and cohesion within their writing. In contrast, ChatGPT manifested a superior performance in accuracy compared to the children. Concerning complexity, children exhibited superior skills in science-themed writing, while ChatGPT prevailed in nature-themed writing. Significantly, this research is pioneering in revealing that nine-year-old children convey stronger emotions than ChatGPT in their Chinese compositions.

公開日:2024-09-24
翻訳日:2024-11-09 10:12:15

# 大規模言語モデル生成データのソース属性

Source Attribution for Large Language Model-Generated Data ( http://arxiv.org/abs/2310.00646v2 )

ライセンス: Link先を確認

Jingtan Wang, Xinyang Lu, Zitong Zhao, Zhongxiang Dai, Chuan-Sheng Foo, See-Kiong Ng, Bryan Kian Hsiang Low,

(参考訳) LLM(Large Language Models)の印象的なパフォーマンスと商業化の可能性は、トレーニングデータの知的財産権(IP)に対する深刻な懸念を引き起こしている。特に、LLMによって生成された合成テキストは、LLMのトレーニングに使用されるデータのIPを侵害する可能性がある。この目的のために、LLMによる合成テキストの生成に寄与したデータ提供者を特定することにより、ソース属性を実現できることが不可欠である。そこで本稿では,LLMが電子透かしを組み込んだ合成テキストを作成できるようにし,電子透かしによってこの問題に対処できることを述べる。このようなウォーターマーキングフレームワーク(例えば、ソース属性の精度、敵に対するロバスト性)の鍵となる特性を特定し、アルゴリズム設計によりこれらの重要な特性を満たすソース属性フレームワークを提案する。我々のフレームワークは,LLMが生成したテキストからデータ提供者への正確なマッピングを学習することを可能にする。大規模な実証実験により,本フレームワークが効果的な情報源属性を達成できることが示唆された。

The impressive performances of Large Language Models (LLMs) and their immense potential for commercialization have given rise to serious concerns over the Intellectual Property (IP) of their training data. In particular, the synthetic texts generated by LLMs may infringe the IP of the data being used to train the LLMs. To this end, it is imperative to be able to perform source attribution by identifying the data provider who contributed to the generation of a synthetic text by an LLM. In this paper, we show that this problem can be tackled by watermarking, i.e., by enabling an LLM to generate synthetic texts with embedded watermarks that contain information about their source(s). We identify the key properties of such watermarking frameworks (e.g., source attribution accuracy, robustness against adversaries), and propose a source attribution framework that satisfies these key properties due to our algorithmic designs. Our framework enables an LLM to learn an accurate mapping from the generated texts to data providers, which sets the foundation for effective source attribution. Extensive empirical evaluations show that our framework achieves effective source attribution.

公開日:2024-09-25
翻訳日:2024-11-09 10:12:15

# すべてのデータセット数:ジョイントデータセットトレーニングによる単眼3Dオブジェクト検出のスケールアップ

Every Dataset Counts: Scaling up Monocular 3D Object Detection with Joint Datasets Training ( http://arxiv.org/abs/2310.00920v4 )

ライセンス: Link先を確認

Fulong Ma, Xiaoyang Yan, Guoyang Zhao, Xiaojie Xu, Yuxuan Liu, Jun Ma, Ming Liu,

(参考訳) 単眼3D物体検出は、自動運転において重要な役割を果たす。しかし、既存の単眼3D検出アルゴリズムは、LiDAR測定から得られる3Dラベルに依存しており、このラベルは新しいデータセットに対して取得コストが高く、新しい環境への展開も難しい。本研究では、多様な3Dおよび2Dデータセットの集合を用いて単眼3D物体検出モデルを学習するパイプラインを検討する。提案フレームワークは、(1)様々なカメラ設定にまたがって機能するロバストな単眼3Dモデル、(2)クラスアノテーションが異なるデータセットに対応するための選択的学習戦略、(3)2Dラベルのみを含むシーンでの検出性能を向上させる、2Dラベルを用いた擬似3D学習手法、の3つの要素からなる。このフレームワークにより、様々なオープンな3D/2Dデータセットの結合集合上でモデルを学習し、大幅に強い汎化能力を持ち、2Dラベルのみを持つ新しいデータセットでの性能も向上したモデルを得ることができる。我々はKITTI/nuScenes/ONCE/Cityscapes/BDD100Kデータセットで広範な実験を行い、提案手法のスケーリング能力を実証した。

Monocular 3D object detection plays a crucial role in autonomous driving. However, existing monocular 3D detection algorithms depend on 3D labels derived from LiDAR measurements, which are costly to acquire for new datasets and challenging to deploy in novel environments. Specifically, this study investigates the pipeline for training a monocular 3D object detection model on a diverse collection of 3D and 2D datasets. The proposed framework comprises three components: (1) a robust monocular 3D model capable of functioning across various camera settings, (2) a selective-training strategy to accommodate datasets with differing class annotations, and (3) a pseudo 3D training approach using 2D labels to enhance detection performance in scenes containing only 2D labels. With this framework, we could train models on a joint set of various open 3D/2D datasets to obtain models with significantly stronger generalization capability and enhanced performance on new datasets with only 2D labels. We conduct extensive experiments on KITTI/nuScenes/ONCE/Cityscapes/BDD100K datasets to demonstrate the scaling ability of the proposed method.

公開日:2024-09-24
翻訳日:2024-11-09 10:12:15

# 誘引子ダイナミクスによる離散的、構成的、象徴的表現

Discrete, compositional, and symbolic representations through attractor dynamics ( http://arxiv.org/abs/2310.01807v2 )

ライセンス: Link先を確認

Andrew Nam, Eric Elmoznino, Nikolay Malkin, James McClelland, Yoshua Bengio, Guillaume Lajoie,

(参考訳) シンボリックシステムは、人間の推論と行動の多くの側面に根ざしたルールと関係をカプセル化するので、認知過程をモデル化するための強力なフレームワークである。これらのモデルの中心は、体系性、構成性、生産性であり、認知科学と人工知能の両方において貴重である。しかし、いくつかの制限が残っている。例えば、構造化された記号過程と潜在サブシンボル過程の統合は、量子化やソフトマックスサンプリングのようなフィアット手法によって計算レベルで実装されている。そこで本研究では,思考の確率的言語(PLoT)に似た認知過程をモデル化するために,アトラクタダイナミクスを記号表現と統合した新しいニューラル確率力学系モデルを提案する。我々のモデルは、連続表現空間を、事前定義されたプリミティブに頼るのではなく、教師なし学習を通じて、記号系の意味性と構成性の特徴を反映する、記号列に対応する引き付け状態を持つ離散盆地に分割する。さらに、PLoTと同様に、入力データとシンボルエンコーディングの相互情報を反映したアトラクタ状態の多種多様な分布のサンプルを学習する。このアプローチは、認知操作の複雑な双対性を反映したより包括的なモデルを提供する、AIで表現力の証明された神経弁別可能な基質であるニューラルダイナミクスを通じて、シンボル処理とサブシンボル処理の両方を統合する統一的なフレームワークを確立する。

Symbolic systems are powerful frameworks for modeling cognitive processes as they encapsulate the rules and relationships fundamental to many aspects of human reasoning and behavior. Central to these models are systematicity, compositionality, and productivity, making them invaluable in both cognitive science and artificial intelligence. However, certain limitations remain. For instance, the integration of structured symbolic processes and latent sub-symbolic processes has been implemented at the computational level through fiat methods such as quantization or softmax sampling, which assume, rather than derive, the operations underpinning discretization and symbolicization. In this work, we introduce a novel neural stochastic dynamical systems model that integrates attractor dynamics with symbolic representations to model cognitive processes akin to the probabilistic language of thought (PLoT). Our model segments the continuous representational space into discrete basins, with attractor states corresponding to symbolic sequences, that reflect the semanticity and compositionality characteristic of symbolic systems through unsupervised learning, rather than relying on pre-defined primitives. Moreover, like PLoT, our model learns to sample a diverse distribution of attractor states that reflect the mutual information between the input data and the symbolic encodings. This approach establishes a unified framework that integrates both symbolic and sub-symbolic processing through neural dynamics, a neuro-plausible substrate with proven expressivity in AI, offering a more comprehensive model that mirrors the complex duality of cognitive operations.

公開日:2024-09-26
翻訳日:2024-11-09 10:12:15

# 会話型健康エージェント:パーソナライズされたLLM駆動エージェントフレームワーク

Conversational Health Agents: A Personalized LLM-Powered Agent Framework ( http://arxiv.org/abs/2310.02374v5 )

ライセンス: Link先を確認

Mahyar Abbasian, Iman Azimi, Amir M. Rahmani, Ramesh Jain,

(参考訳) 会話型健康エージェント(英: Conversational Health Agents、CHA)は、援助や診断などの医療サービスを提供する対話型システムである。現在のCHA、特にLLM(Large Language Models)を利用するものは、主に会話の側面に焦点を当てています。しかし、彼らは限られたエージェント機能を提供し、特にマルチステップの問題解決、パーソナライズされた会話、マルチモーダルデータ分析を欠いている。私たちの目標はこれらの制限を克服することです。我々は,対話エージェントがユーザの医療クエリに対してパーソナライズされた応答を生成するために,オープンソースのLLMフレームワークであるopenCHAを提案する。このフレームワークにより、開発者はデータソース、知識ベース、分析モデルを含む外部ソースをLLMベースのソリューションに統合できる。 openCHAには、外部ソースからの情報を収集するためのアクションを計画し実行するためのオーケストレータが含まれている。知識獲得、問題解決機能、多言語とマルチモーダルの会話を促進し、さまざまなAIプラットフォームとのインタラクションを促進する。 2つのデモと4つのユースケースを通じて、複雑なヘルスケアタスクを扱うためのフレームワークの能力について説明する。さらに、GitHubを通じてコミュニティが利用可能なオープンソースとしてopenCHAをリリースしています。

Conversational Health Agents (CHAs) are interactive systems that provide healthcare services, such as assistance and diagnosis. Current CHAs, especially those utilizing Large Language Models (LLMs), primarily focus on conversation aspects. However, they offer limited agent capabilities, specifically lacking multi-step problem-solving, personalized conversations, and multimodal data analysis. Our aim is to overcome these limitations. We propose openCHA, an open-source LLM-powered framework, to empower conversational agents to generate a personalized response for users' healthcare queries. This framework enables developers to integrate external sources including data sources, knowledge bases, and analysis models, into their LLM-based solutions. openCHA includes an orchestrator to plan and execute actions for gathering information from external sources, essential for formulating responses to user inquiries. It facilitates knowledge acquisition, problem-solving capabilities, multilingual and multimodal conversations, and fosters interaction with various AI platforms. We illustrate the framework's proficiency in handling complex healthcare tasks via two demonstrations and four use cases. Moreover, we release openCHA as open source available to the community via GitHub.

公開日:2024-09-25
翻訳日:2024-11-09 10:12:15

# 原子アンサンブルにおける同時スピンスクイーズと光スクイーズ

Concurrent spin squeezing and light squeezing in an atomic ensemble ( http://arxiv.org/abs/2310.02493v2 )

ライセンス: Link先を確認

Shenchao Jin, Junlei Duan, Youwei Zhang, Xichang Zhang, Han Bao, Heng Shen, Liantuan Xiao, Suotang Jia, Mingfeng Wang, Yanhong Xiao,

(参考訳) スクイーズドスピン状態とスクイーズド光は、いずれも量子計測と量子情報科学の鍵となる資源であるが、これまでの実験では別々に研究されてきた。この2種類の量子状態を1つの実験装置で同時に生成することは興味深いが、依然として挑戦的な目標である。本稿では、巧みに設計された対称な原子・光相互作用に基づく新しいプロトコルを提案し、高温原子アンサンブルにおいて$0.61\pm0.09~\mathrm{dB}$のスピンスクイーズと$0.65^{+0.11}_{-0.10}~\mathrm{dB}$の光スクイーズを同時に実現した原理実証実験の結果を報告する。スクイーズ過程は決定論的であり、光場と集団原子スピンの両方に対して固定されたスクイーズ方向を与える。さらに、スクイーズド光のモードは単一の空間モードの複数の周波数サイドバンドに存在する。この新しいタイプの二重スクイーズド状態は、量子強化計測と量子ネットワークに適用できる。我々の方法は、オプトメカニクス、冷却原子、捕捉イオンなどの他の量子プラットフォームに拡張することができる。

Squeezed spin states and squeezed light are both key resources for quantum metrology and quantum information science, but have been separately investigated in experiments so far. Simultaneous generation of these two types of quantum states in one experimental setup is intriguing but remains a challenging goal. Here we propose a novel protocol based on judiciously engineered symmetric atom-light interaction, and report proof-of-principle experimental results of concurrent spin squeezing of $0.61\pm0.09~\mathrm{dB}$ and light squeezing of $0.65^{+0.11}_{-0.10}~\mathrm{dB}$ in a hot atomic ensemble. The squeezing process is deterministic, yielding fixed squeezing directions for both the light field and the collective atomic spin. Furthermore, the squeezed light modes lie in the multiple frequency sidebands of a single spatial mode. This new type of dual squeezed state is applicable for quantum enhanced metrology and quantum networks. Our method can be extended to other quantum platforms such as optomechanics, cold atoms and trapped ions.

公開日:2024-09-24
翻訳日:2024-11-09 10:12:15
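
The squeezing levels in the abstract above are quoted in decibels. As a small worked conversion, and assuming the common convention that squeezing in dB equals $-10\log_{10}$ of the ratio between the squeezed variance and the reference (quantum-noise-limited) variance, the quoted central values correspond to the variance ratios computed below. The function name `db_to_variance_ratio` is hypothetical, and the abstract does not spell out its convention.

```python
def db_to_variance_ratio(squeezing_db: float) -> float:
    """Convert squeezing in dB to (squeezed variance) / (reference variance).

    Assumes squeezing_db = -10 * log10(ratio), so positive dB means noise
    below the reference level.
    """
    return 10 ** (-squeezing_db / 10)


# Central values quoted in the abstract; uncertainties are ignored here.
for label, level_db in [("spin", 0.61), ("light", 0.65)]:
    print(label, round(db_to_variance_ratio(level_db), 3))
```

Under this convention, both results correspond to a noise variance roughly 13 to 14 percent below the reference level.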

# 物理インフォームドニューラルネットワークを用いた多相流中遠心ポンプの学習特性パラメータとダイナミクス

Learning characteristic parameters and dynamics of centrifugal pumps under multiphase flow using physics-informed neural networks ( http://arxiv.org/abs/2310.03001v2 )

ライセンス: Link先を確認

Felipe de Castro Teixeira Carvalho, Kamaljyoti Nath, Alberto Luiz Serpa, George Em Karniadakis,

(参考訳) 電気式潜水ポンプ(ESP)は、石油・ガス産業において人工揚力システムとして広く利用されている。これらのポンプは、炭化水素、水、堆積物の複雑な混合物からなる多相流に頻繁に遭遇する。このような混合物はエマルションの形成につながり、個々の相とは異なる有効粘性によって特徴づけられる。これらの条件を評価するために使用される従来の多相流量計は、高い運用コストと劣化に対する感受性によって負担される。そこで本研究では,ESPシステムの流体特性,動的状態,重要なパラメータを間接的に推定する物理インフォームドニューラルネットワーク(PINN)モデルを提案する。ポンプからの吸気・吐出圧力測定を用いて, 確実に推定できるパラメータのサブセットについて, 包括的構造的, 実用的識別可能性分析を行った。 PINNモデルの有効性は,これらの圧力測定を入力データとして,未知の状態とパラメータを推定することによって検証した。さらに, 各種含水シナリオのシミュレーションデータと実験データを用いて, PINNモデルの性能を粒子フィルタ法と比較した。比較分析の結果, PINNモデルは従来の多相流速計の代替として有望な可能性を秘めており, 運用効率の向上とESPアプリケーションのコスト削減に期待できる道筋となっている。

Electrical submersible pumps (ESPs) are prevalently utilized as artificial lift systems in the oil and gas industry. These pumps frequently encounter multiphase flows comprising a complex mixture of hydrocarbons, water, and sediments. Such mixtures lead to the formation of emulsions, characterized by an effective viscosity distinct from that of the individual phases. Traditional multiphase flow meters, employed to assess these conditions, are burdened by high operational costs and susceptibility to degradation. To this end, this study introduces a physics-informed neural network (PINN) model designed to indirectly estimate the fluid properties, dynamic states, and crucial parameters of an ESP system. A comprehensive structural and practical identifiability analysis was performed to delineate the subset of parameters that can be reliably estimated through the use of intake and discharge pressure measurements from the pump. The efficacy of the PINN model was validated by estimating the unknown states and parameters using these pressure measurements as input data. Furthermore, the performance of the PINN model was benchmarked against the particle filter method utilizing both simulated and experimental data across varying water content scenarios. The comparative analysis suggests that the PINN model holds significant potential as a viable alternative to conventional multiphase flow meters, offering a promising avenue for enhancing operational efficiency and reducing costs in ESP applications.

公開日:2024-09-23
翻訳日:2024-11-09 10:12:15

# 構造を考慮したレコメンデーション埋め込み進化のためのグラフ強化オプティマイザ

Graph-enhanced Optimizers for Structure-aware Recommendation Embedding Evolution ( http://arxiv.org/abs/2310.03032v3 )

ライセンス: Link先を確認

Cong Xu, Jun Wang, Jianyong Wang, Wei Zhang,

(参考訳) 埋め込みは、現実世界の実体の仮想表現であり、その後の意思決定モデルの基礎であるため、現代のレコメンデーションシステムにおいて重要な役割を果たす。本稿では、関連するノードが各ステップで同様に進化するよう促す、新しい埋め込み更新機構であるSEvo(Structure-aware Embedding Evolution)を提案する。通常、中間モジュールとして機能するGNN(Graph Neural Network)とは異なり、SEvoはトレーニング中に最小の計算オーバーヘッドでグラフ構造情報を埋め込みに直接注入することができる。SEvoの収束特性とその潜在的な変種は、設計の妥当性を正当化するために理論的に解析される。さらに、SEvoは最先端のパフォーマンスのために既存のオプティマイザにシームレスに統合できる。特に、モーメント推定補正を施したSEvo強化AdamWは、さまざまなモデルとデータセットで一貫した改善を示し、明示的なGNNモジュールを超えてグラフ構造情報を効果的に活用する新たな技術経路を示唆している。

Embeddings play a key role in modern recommender systems because they are virtual representations of real-world entities and the foundation for subsequent decision-making models. In this paper, we propose a novel embedding update mechanism, Structure-aware Embedding Evolution (SEvo for short), to encourage related nodes to evolve similarly at each step. Unlike a GNN (Graph Neural Network), which typically serves as an intermediate module, SEvo is able to directly inject graph structural information into embeddings with minimal computational overhead during training. The convergence properties of SEvo along with its potential variants are theoretically analyzed to justify the validity of the designs. Moreover, SEvo can be seamlessly integrated into existing optimizers for state-of-the-art performance. In particular, SEvo-enhanced AdamW with moment estimate correction demonstrates consistent improvements across a spectrum of models and datasets, suggesting a novel technical route to effectively utilize graph structural information beyond explicit GNN modules.

公開日:2024-09-27
翻訳日:2024-11-09 10:12:15

# 非滑らか弱凸有限和結合合成最適化

Non-Smooth Weakly-Convex Finite-sum Coupled Compositional Optimization ( http://arxiv.org/abs/2310.03234v5 )

ライセンス: Link先を確認

Quanqi Hu, Dixian Zhu, Tianbao Yang,

(参考訳) 本稿では,新しい合成最適化問題の族である$\underline{\bf n}$on-$\underline{\bf s}$mooth $\underline{\bf w}$eakly-$\underline{\bf c}$onvex $\underline{\bf f}$inite-sum $\underline{\bf c}$oupled $\underline{\bf c}$ompositional $\underline{\bf o}$ptimization (NSWC FCCO)について検討する。機械学習とAIにおける幅広い応用と、経験的リスク最小化に基づく確率的アルゴリズムの欠点に対処できることから、FCCOへの関心が高まっている。しかし、FCCOに関する現在の研究は、内部関数と外部関数の両方が滑らかであることを仮定しており、より多様な問題に取り組む可能性が制限されている。本研究は、外部関数が弱凸かつ非減少で、内部関数が弱凸である非滑らかな弱凸FCCOを調べることにより、この領域を拡張する。単一ループのアルゴリズムを解析し、目的関数のモロー包絡の$\epsilon$-停留点を求めるための計算量を確立する。さらに、3つの関数の入れ子構造を特徴とする、新しい非滑らか弱凸三層(tri-level)有限和結合合成最適化問題にもアルゴリズムを拡張する。最後に、two-way partial AUC最大化およびmulti-instance two-way partial AUC最大化のためのディープラーニングへの応用を検討し、実証実験により提案アルゴリズムの有効性を示す。

This paper investigates new families of compositional optimization problems, called $\underline{\bf n}$on-$\underline{\bf s}$mooth $\underline{\bf w}$eakly-$\underline{\bf c}$onvex $\underline{\bf f}$inite-sum $\underline{\bf c}$oupled $\underline{\bf c}$ompositional $\underline{\bf o}$ptimization (NSWC FCCO). There has been a growing interest in FCCO due to its wide-ranging applications in machine learning and AI, as well as its ability to address the shortcomings of stochastic algorithms based on empirical risk minimization. However, current research on FCCO presumes that both the inner and outer functions are smooth, limiting its potential to tackle a more diverse set of problems. Our research expands on this area by examining non-smooth weakly-convex FCCO, where the outer function is weakly convex and non-decreasing, and the inner function is weakly convex. We analyze a single-loop algorithm and establish its complexity for finding an $\epsilon$-stationary point of the Moreau envelope of the objective function. We also extend the algorithm to solve novel non-smooth weakly-convex tri-level finite-sum coupled compositional optimization problems, which feature a nested arrangement of three functions. Lastly, we explore the applications of our algorithms in deep learning for two-way partial AUC maximization and multi-instance two-way partial AUC maximization, using empirical studies to showcase the effectiveness of the proposed algorithms.

公開日:2024-09-24
翻訳日:2024-11-09 10:12:15
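
The abstract above measures convergence by $\epsilon$-stationarity of the Moreau envelope of a non-smooth objective. As a toy illustration of that notion only, the sketch below evaluates the Moreau envelope $\min_y \{\phi(y) + \frac{1}{2\lambda}(y-x)^2\}$ for $\phi(y)=|y|$, which is convex and therefore also weakly convex. The example function, the step `lam`, and all function names are assumptions chosen for illustration; none of this is the paper's objective or algorithm.

```python
def prox_abs(x: float, lam: float) -> float:
    """Proximal point of |.| at x with parameter lam (soft-thresholding)."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def moreau_env_abs(x: float, lam: float) -> float:
    """Moreau envelope of |.| at x: the minimum of |y| + (y - x)^2 / (2 lam)."""
    y = prox_abs(x, lam)
    return abs(y) + (x - y) ** 2 / (2 * lam)


def moreau_grad_abs(x: float, lam: float) -> float:
    """Gradient of the envelope, given by (x - prox(x)) / lam."""
    return (x - prox_abs(x, lam)) / lam


def is_eps_stationary(x: float, lam: float, eps: float) -> bool:
    """x is eps-stationary for the envelope when the gradient norm is at most eps."""
    return abs(moreau_grad_abs(x, lam)) <= eps


lam = 0.5
for x in (2.0, 0.2, 0.0):
    print(x, moreau_env_abs(x, lam), moreau_grad_abs(x, lam))
```

The envelope is smooth even though $|y|$ has a kink at zero, which is why its gradient norm can serve as a stationarity measure where the original objective has no gradient.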

# リレーショナル・コンボリューションによる階層的関係表現の学習

Learning Hierarchical Relational Representations through Relational Convolutions ( http://arxiv.org/abs/2310.03240v3 )

ライセンス: Link先を確認

Awni Altabaa, John Lafferty,

(参考訳) ディープラーニングの研究分野は、関係的特徴表現の学習を支援するアーキテクチャと帰納的バイアスの研究である。本稿では,階層的関係の表現を学習する上での課題,すなわちオブジェクト群間の高次関係パターンについて述べる。本稿では,単純なモジュールを構成することで,より複雑な関係性を段階的に捉える計算機構を備えたニューラルネットワークである「リレーショナル畳み込みネットワーク」を紹介する。このフレームワークの重要なコンポーネントは、グラフレットフィルタを結合することで、オブジェクトのグループ内のリレーショナルパターンをキャプチャする新しい操作である。関係的畳み込みを構成することは、高次の階層的関係の表現を学ぶ深いアーキテクチャをもたらす。アーキテクチャのモチベーションと詳細、およびリレーショナル畳み込みネットワークが階層構造を持つリレーショナルタスクをモデル化するための効果的なフレームワークを提供するための一連の実験を示す。

An evolving area of research in deep learning is the study of architectures and inductive biases that support the learning of relational feature representations. In this paper, we address the challenge of learning representations of hierarchical relations--that is, higher-order relational patterns among groups of objects. We introduce "relational convolutional networks", a neural architecture equipped with computational mechanisms that capture progressively more complex relational features through the composition of simple modules. A key component of this framework is a novel operation that captures relational patterns in groups of objects by convolving graphlet filters--learnable templates of relational patterns--against subsets of the input. Composing relational convolutions gives rise to a deep architecture that learns representations of higher-order, hierarchical relations. We present the motivation and details of the architecture, together with a set of experiments to demonstrate how relational convolutional networks can provide an effective framework for modeling relational tasks that have hierarchical structure.

公開日:2024-09-26
翻訳日:2024-11-09 10:12:15

# データ依存結合を持つ確率補間子

Stochastic interpolants with data-dependent couplings ( http://arxiv.org/abs/2310.03725v3 )

ライセンス: Link先を確認

Michael S. Albergo, Mark Goldstein, Nicholas M. Boffi, Rajesh Ranganath, Eric Vanden-Eijnden,

(参考訳) フローや拡散のような測度の動的輸送にインスパイアされた生成モデルは、2つの確率密度の間の連続時間マップを構築する。従来、これらのうちの1つはターゲット密度であり、サンプルを通してのみアクセス可能であり、もう1つはデータに依存しない単純な基底密度と見なされている。本研究では,確率的補間子の枠組みを用いて,ベースとターゲット密度の \textit{couple} を定式化する。そこで,ベースからのサンプルを,クラスラベルや連続埋め込みに関する情報を組み込んだ(ただし妨げない)方法で,ターゲットからのサンプルを条件付きで計算する。これにより、条件付き生成モデルとして機能する動的トランスポートマップを構築することができる。これらのトランスポートマップは、標準的な独立な設定に類似した単純な2乗損失回帰問題を解くことで学習可能であることを示す。超高分解能および in-painting の実験を通じて, 実際に依存結合を構築することの有用性を実証する。

Generative models inspired by dynamical transport of measure -- such as flows and diffusions -- construct a continuous-time map between two probability densities. Conventionally, one of these is the target density, only accessible through samples, while the other is taken as a simple base density that is data-agnostic. In this work, using the framework of stochastic interpolants, we formalize how to \textit{couple} the base and the target densities, whereby samples from the base are computed conditionally given samples from the target in a way that is different from (but does not preclude) incorporating information about class labels or continuous embeddings. This enables us to construct dynamical transport maps that serve as conditional generative models. We show that these transport maps can be learned by solving a simple square loss regression problem analogous to the standard independent setting. We demonstrate the usefulness of constructing dependent couplings in practice through experiments in super-resolution and in-painting.

公開日:2024-09-23
翻訳日:2024-11-09 10:12:15
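
The abstract above describes drawing base samples conditionally on target samples and then learning a transport map by square-loss regression. The sketch below illustrates one way such a coupled pair and regression target could look, assuming a linear interpolant $x_t = (1-t)\,x_0 + t\,x_1$ and an in-painting-style coupling that replaces masked entries with Gaussian noise. Both choices, and every function name, are assumptions for illustration and are not taken from the abstract.

```python
import random


def coupled_base_sample(x1, mask, rng):
    """Draw x0 given x1: keep unmasked entries, fill masked ones with noise."""
    return [rng.gauss(0.0, 1.0) if m else v for v, m in zip(x1, mask)]


def interpolant(x0, x1, t):
    """Linear interpolant between the coupled samples at time t in [0, 1]."""
    return [(1 - t) * a + t * b for a, b in zip(x0, x1)]


def velocity_target(x0, x1):
    """Time derivative of the linear interpolant, used as the regression target."""
    return [b - a for a, b in zip(x0, x1)]


def square_loss(prediction, target):
    """Mean squared error between a predicted velocity and the target."""
    return sum((p - q) ** 2 for p, q in zip(prediction, target)) / len(target)


rng = random.Random(0)
x1 = [1.0, 2.0, 3.0]          # hypothetical target sample
mask = [False, True, False]   # the second entry is to be in-painted
x0 = coupled_base_sample(x1, mask, rng)
t = rng.random()
xt = interpolant(x0, x1, t)
target = velocity_target(x0, x1)
# A model b(t, xt) would be trained to minimise square_loss(b(t, xt), target).
print(xt, target)
```

With this coupling the regression target is zero on the observed entries, so a learned velocity field only has to transport the masked part of the sample.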

PDF登録状況（最新200件）