Fugu-MT Paper Translation (Summary): Are We Falling in a Middle-Intelligence Trap? An Analysis and Mitigation of the Reversal Curse

Paper summary: Are We Falling in a Middle-Intelligence Trap? An Analysis and Mitigation of the Reversal Curse

arxiv url: http://arxiv.org/abs/2311.07468v1
Date: Mon, 13 Nov 2023 17:01:12 GMT
Status: Translation complete
Last updated in system: 2023-11-14 13:35:48.420309
Title: Are We Falling in a Middle-Intelligence Trap? An Analysis and Mitigation of the Reversal Curse
Title (reference translation): Are We Falling into a Middle-Intelligence Trap? Analysis and Mitigation of the Reversal Curse
Authors: Ang Lv and Kaiyi Zhang and Shufang Xie and Quan Tu and Yuhan Chen and Ji-Rong Wen and Rui Yan
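Abstract summary: Recent studies have highlighted a phenomenon in large language models (LLMs) known as the "reversal curse." We argue that the reversal curse is partly a result of specific model training objectives. We propose BICO (BIdirectional Casual language modeling Optimization) to mitigate the reversal curse.
Reference score (independently calculated attention metric): 73.65112477688353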
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent studies have highlighted a phenomenon in large language models (LLMs) known as "the reversal curse," in which the order of knowledge entities in the training data biases the models' comprehension. For example, if a model is trained on sentences where entity A consistently appears before entity B, it can respond to queries about A by providing B. However, it may encounter confusion when presented with questions concerning B. We contend that the reversal curse is partially a result of specific model training objectives, particularly evident in the prevalent use of the next-token prediction within most causal language models. For the next-token prediction, models solely focus on a token's preceding context, resulting in a restricted comprehension of the input. In contrast, we illustrate that the GLM, trained using the autoregressive blank infilling objective where tokens to be predicted have access to the entire context, exhibits better resilience against the reversal curse. We propose a novel training method, BIdirectional Casual language modeling Optimization (BICO), designed to mitigate the reversal curse when fine-tuning pretrained causal language models on new data. BICO modifies the causal attention mechanism to function bidirectionally and employs a mask denoising optimization. In the task designed to assess the reversal curse, our approach improves Llama's accuracy from the original 0% to around 70%. We hope that more attention can be focused on exploring and addressing these inherent weaknesses of the current LLMs, in order to achieve a higher level of intelligence.
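Abstract (reference translation): Recent studies have highlighted a phenomenon in large language models (LLMs) known as the "reversal curse," in which the order of knowledge entities in the training data biases the model's comprehension. For example, if a model is trained on sentences in which entity A consistently appears before entity B, it can answer queries about A by providing B, but it may become confused when asked questions about B. We contend that the reversal curse is partly a result of specific training objectives, most evidently next-token prediction, which most causal language models use. In next-token prediction, the model attends only to a token's preceding context, which limits its comprehension of the input. In contrast, GLM, trained with an autoregressive blank-infilling objective in which the tokens to be predicted have access to the entire context, is more resilient to the reversal curse. We propose BIdirectional Casual language modeling Optimization (BICO), a novel training method for mitigating the reversal curse when fine-tuning pretrained causal language models on new data. BICO modifies the causal attention mechanism to operate bidirectionally and applies a mask denoising optimization. On a task designed to assess the reversal curse, the approach improves Llama's accuracy from the original 0% to around 70%. We hope that more attention will be devoted to exploring and addressing these inherent weaknesses of current LLMs in order to reach a higher level of intelligence.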
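
The abstract contrasts next-token prediction, where each token sees only its preceding context, with objectives where the predicted tokens can see the entire context. That difference can be illustrated with attention masks. The following is a minimal sketch in plain Python under stated assumptions: it is not the BICO or GLM implementation, and the function names (`causal_mask`, `bidirectional_mask`, `visible_tokens`) and the toy sentence are hypothetical, chosen only for illustration.

```python
def causal_mask(n):
    """mask[i][j] is True when position i may attend to position j.

    Under a causal (next-token prediction) mask, position i only
    sees positions j <= i, i.e. its preceding context and itself.
    """
    return [[j <= i for j in range(n)] for i in range(n)]


def bidirectional_mask(n):
    """Every position may attend to every other position."""
    return [[True] * n for _ in range(n)]


def visible_tokens(tokens, mask, i):
    """Return the tokens that position i is allowed to attend to."""
    return [tok for j, tok in enumerate(tokens) if mask[i][j]]


# Toy training sentence (an assumption): entity "A" appears before entity "B".
tokens = ["A", "is", "the", "parent", "of", "B"]
n = len(tokens)
pos_a = tokens.index("A")
pos_b = tokens.index("B")

causal = causal_mask(n)
full = bidirectional_mask(n)

# Causal mask: B's position sees A, but A's position never sees B.
a_sees_b_causal = causal[pos_a][pos_b]
b_sees_a_causal = causal[pos_b][pos_a]

# Bidirectional mask: both directions are visible.
a_sees_b_full = full[pos_a][pos_b]

print("causal, A sees B:", a_sees_b_causal)
print("causal, B sees A:", b_sees_a_causal)
print("bidirectional, A sees B:", a_sees_b_full)
print("context visible to A (causal):", visible_tokens(tokens, causal, pos_a))
```

In this toy setting the asymmetry of the causal mask mirrors the asymmetry the abstract describes: the representation at A's position is never computed with B in view, whereas a bidirectional mask removes that restriction. How BICO combines bidirectional attention with its mask denoising optimization is not specified in this summary, so the sketch stops at the masks themselves.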

Related papers

Provably Learning from Language Feedback [22.620909858951197]
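We formalize the Learning from Language Feedback (LLF) problem and state assumptions sufficient to enable learning despite latent rewards. We show that the transfer eluder dimension captures the intuition that information in the feedback changes the learning complexity of the LLF problem. We develop a no-regret algorithm, called $\texttt{HELiX}$, that provably solves LLF problems through sequential interactions.
Paper reference translation (metadata) (2025-06-12T04:35:02Z)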
Enough Coin Flips Can Make LLMs Act Bayesian [71.79085204454039]
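Large language models (LLMs) can generalize from few-shot examples given in the input prompt, an emergent capability known as in-context learning (ICL). We investigate whether LLMs use ICL to perform structured reasoning consistent with a Bayesian framework or instead rely on pattern matching.
Paper reference translation (metadata) (2025-03-06T18:59:23Z)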
Reasoning Robustness of LLMs to Adversarial Typographical Errors [49.99118660264703]
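Large language models (LLMs) have shown impressive capabilities in reasoning with Chain-of-Thought (CoT) prompting. This work studies the robustness of LLMs to typographical errors. We design an Adversarial Typo Attack ($\texttt{ATA}$) algorithm that iteratively samples typos for words important to the query and selects the edit most likely to succeed as an attack.
Paper reference translation (metadata) (2024-11-08T05:54:05Z)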
Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval [55.63711219190506]
Large language models (LLMs) often struggle to pose appropriate search queries. We introduce Learning to Retrieve by Trying (LeReT). LeReT can improve absolute accuracy by up to 29% and downstream generator evaluations by 17%.
Paper reference translation (metadata) (2024-10-30T17:02:54Z)
Utilize the Flow before Stepping into the Same River Twice: Certainty Represented Knowledge Flow for Refusal-Aware Instruction Tuning [68.57166425493283]
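Refusal-Aware Instruction Tuning (RAIT) enables large language models (LLMs) to refuse to answer unknown questions. RAIT modifies training samples based on the correctness of the initial LLM's responses. This crude approach can cause LLMs to over-refuse questions that they could have answered correctly.
Paper reference translation (metadata) (2024-10-09T14:12:51Z)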
Towards a Theoretical Understanding of the 'Reversal Curse' via Training Dynamics [45.69328374321502]
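Auto-regressive large language models (LLMs) show impressive capabilities in solving many complex reasoning tasks. Yet an LLM can fail to conclude '$B \gets A$' during inference, even when the two statements are semantically identical. We theoretically analyze the reversal curse via the training dynamics of gradient descent for two auto-regressive models.
Paper reference translation (metadata) (2024-05-07T21:03:51Z)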
Mitigating Reversal Curse in Large Language Models via Semantic-aware Permutation Training [57.771940716189114]
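Large language models (LLMs) have been shown to suffer from the "reversal curse." The root cause of the reversal curse is that word order differs between the training and inference stages. To address this problem, we propose Semantic-aware Permutation Training (SPT).
Paper reference translation (metadata) (2024-03-01T18:55:20Z)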
Deficiency of Large Language Models in Finance: An Empirical Examination of Hallucination [7.627664978437055]
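Hallucination is recognized as a fundamental deficiency of large language models (LLMs). This paper empirically examines the ability of LLMs to explain financial concepts and terminology. It evaluates the efficacy of four practical methods: few-shot learning, Decoding by Contrasting Layers (DoLa), retrieval-augmented generation (RAG), and a prompt-based tool-learning method that generates query commands for a function.
Paper reference translation (metadata) (2023-11-27T05:27:13Z)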
LLMRefine: Pinpointing and Refining Large Language Models via Fine-Grained Actionable Feedback [65.84061725174269]
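Recent large language models (LLMs) leverage human feedback to improve generation quality. We propose LLMRefine, an inference-time optimization method for refining an LLM's output. Experiments cover three text generation tasks: machine translation, long-form question answering (QA), and topical summarization. LLMRefine consistently outperforms all baseline approaches, with improvements of up to 1.7 MetricX points on translation tasks, 8.1 ROUGE-L on ASQA, and 2.2 ROUGE-L on topical summarization.
Paper reference translation (metadata) (2023-11-15T19:52:11Z)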
Multilingual Machine Translation with Large Language Models: Empirical Results and Analysis [103.89753784762445]
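Large language models (LLMs) have demonstrated remarkable potential in handling multilingual machine translation (MMT). This paper systematically investigates the advantages and challenges of LLMs for MMT. It thoroughly evaluates eight LLMs, including ChatGPT and GPT-4.
Paper reference translation (metadata) (2023-04-10T15:51:30Z)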

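The related papers list is generated automatically from the titles and abstracts of papers on this site.

This page shows information for the specified paper.
The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any results arising from the use of this site (including all information and translations).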