Fugu-MT Paper Translation (Abstract): Ask Me Anything: A simple strategy for prompting language models

Paper abstract: Ask Me Anything: A simple strategy for prompting language models

arxiv url: http://arxiv.org/abs/2210.02441v2
Date: Thu, 6 Oct 2022 06:39:56 GMT
Status: Translation complete
Last updated in system: 2022-10-07 15:00:31.649788
Title: Ask Me Anything: A simple strategy for prompting language models
Title (reference translation): Ask Me Anything: A simple strategy for prompting language models
Authors: Simran Arora, Avanika Narayan, Mayee F. Chen, Laurel Orr, Neel Guha, Kush Bhatia, Ines Chami, Frederic Sala, Christopher Ré
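Abstract summary: Large language models (LLMs) transfer to new tasks given only a natural-language prompt. This work shows that question-answering (QA) prompts outperform prompts that restrict the model outputs. The collected prompts are applied to obtain several noisy votes for an input's true label. The prompts can have very different accuracies and complex dependencies.
Reference score (independently computed attention score): 24.294416731247427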
License: http://creativecommons.org/publicdomain/zero/1.0/
Abstract: Large language models (LLMs) transfer well to new tasks out-of-the-box simply given a natural language prompt that demonstrates how to perform the task and no additional training. Prompting is a brittle process wherein small modifications to the prompt can cause large variations in the model predictions, and therefore significant effort is dedicated towards designing a painstakingly "perfect prompt" for a task. To mitigate the high degree of effort involved in prompt-design, we instead ask whether producing multiple effective, yet imperfect, prompts and aggregating them can lead to a high quality prompting strategy. Our observations motivate our proposed prompting method, ASK ME ANYTHING (AMA). We first develop an understanding of the effective prompt formats, finding that question-answering (QA) prompts, which encourage open-ended generation ("Who went to the park?") tend to outperform those that restrict the model outputs ("John went to the park. Output True or False."). Our approach recursively uses the LLM itself to transform task inputs to the effective QA format. We apply the collected prompts to obtain several noisy votes for the input's true label. We find that the prompts can have very different accuracies and complex dependencies and thus propose to use weak supervision, a procedure for combining the noisy predictions, to produce the final predictions for the inputs. We evaluate AMA across open-source model families (e.g., EleutherAI, BLOOM, OPT, and T0) and model sizes (125M-175B parameters), demonstrating an average performance lift of 10.2% over the few-shot baseline. This simple strategy enables the open-source GPT-J-6B model to match and exceed the performance of few-shot GPT3-175B on 15 of 20 popular benchmarks. Averaged across these tasks, the GPT-J-6B model outperforms few-shot GPT3-175B. We release our code here: https://github.com/HazyResearch/ama_prompting
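Abstract (reference translation): Large language models (LLMs) transfer well to new tasks given only a natural-language prompt that demonstrates how to perform the task, with no additional training. Prompting is a brittle process in which small changes to the prompt can cause large variation in the model predictions. To reduce the high effort involved in prompt design, we ask whether producing multiple effective but imperfect prompts and aggregating them can lead to a high-quality prompting strategy. Our observations motivate the proposed method, ASK ME ANYTHING (AMA). We first develop an understanding of effective prompt formats and find that question-answering (QA) prompts, which encourage open-ended generation ("Who went to the park?"), outperform prompts that restrict the model outputs ("John went to the park. Output True or False."). The proposed method uses the LLM itself to transform task inputs into the effective QA format. The collected prompts are applied to obtain several noisy votes for the input's true label. Because the prompts can have very different accuracies and complex dependencies, we use weak supervision, a procedure for combining noisy predictions, to produce the final predictions for the inputs. We evaluate AMA across open-source model families (e.g., EleutherAI, BLOOM, OPT, and T0) and model sizes (125M-175B parameters) and show an average performance lift of 10.2% over the few-shot baseline. This simple strategy enables the open-source GPT-J-6B model to match and exceed the performance of few-shot GPT3-175B on 15 of 20 benchmarks. The GPT-J-6B model outperforms few-shot GPT3-175B. The code is released here: https://github.com/hazyresearch/ama_prompting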
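
The aggregation step described in the abstract, where several imperfect prompts each cast a noisy vote for an input's label, can be illustrated with a small sketch. This is not the paper's weak-supervision procedure: it assumes binary labels, assumes the prompts are independent, and assumes an accuracy estimate for each prompt is already available. The function names `majority_vote` and `weighted_vote` are hypothetical and are not taken from the released code.

```python
import math
from collections import Counter


def majority_vote(votes):
    """Return the most common label; ties go to the label seen first."""
    return Counter(votes).most_common(1)[0][0]


def weighted_vote(votes, accuracies):
    """Combine votes, weighting each prompt by the log-odds of its accuracy.

    Assumption (not from the paper): prompts are independent and the
    per-prompt accuracies are given rather than estimated.
    """
    scores = {}
    for label, acc in zip(votes, accuracies):
        # Clamp to avoid infinite weights at accuracy 0 or 1.
        acc = min(max(acc, 1e-6), 1 - 1e-6)
        scores[label] = scores.get(label, 0.0) + math.log(acc / (1 - acc))
    return max(scores, key=scores.get)


# Three prompts vote on one input; the first prompt is assumed more reliable.
votes = ["yes", "no", "no"]
accuracies = [0.9, 0.6, 0.55]

print(majority_vote(votes))              # no
print(weighted_vote(votes, accuracies))  # yes
```

The two results differ because one accurate prompt can outweigh two weak ones. This is the kind of situation in which a plain majority vote loses information. Modeling dependencies between prompts, which the abstract also mentions, is outside the scope of this sketch.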

Related papers