Fugu-MT Paper Translation (Summary): Are Hard Examples also Harder to Explain? A Study with Human and Model-Generated Explanations

Paper summary: Are Hard Examples also Harder to Explain? A Study with Human and Model-Generated Explanations

arxiv url: http://arxiv.org/abs/2211.07517v1
Date: Mon, 14 Nov 2022 16:46:14 GMT
Status: Translation complete
System update date: 2022-11-15 16:01:50.239099
Title: Are Hard Examples also Harder to Explain? A Study with Human and Model-Generated Explanations
Authors: Swarnadeep Saha, Peter Hase, Nazneen Rajani, Mohit Bansal
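Abstract summary: We study the relationship between explainability and sample hardness. We compare human-written explanations with explanations generated by GPT-3. We also find that the hardness of the in-context examples affects the quality of GPT-3 explanations.
Reference score (attention score computed by this site): 82.12092864529605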
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent work on explainable NLP has shown that few-shot prompting can enable large pretrained language models (LLMs) to generate grammatical and factual natural language explanations for data labels. In this work, we study the connection between explainability and sample hardness by investigating the following research question - "Are LLMs and humans equally good at explaining data labels for both easy and hard samples?" We answer this question by first collecting human-written explanations in the form of generalizable commonsense rules on the task of Winograd Schema Challenge (Winogrande dataset). We compare these explanations with those generated by GPT-3 while varying the hardness of the test samples as well as the in-context samples. We observe that (1) GPT-3 explanations are as grammatical as human explanations regardless of the hardness of the test samples, (2) for easy examples, GPT-3 generates highly supportive explanations but human explanations are more generalizable, and (3) for hard examples, human explanations are significantly better than GPT-3 explanations both in terms of label-supportiveness and generalizability judgements. We also find that hardness of the in-context examples impacts the quality of GPT-3 explanations. Finally, we show that the supportiveness and generalizability aspects of human explanations are also impacted by sample hardness, although by a much smaller margin than models. Supporting code and data are available at https://github.com/swarnaHub/ExplanationHardness

Related papers

Scenarios and Approaches for Situated Natural Language Explanations [18.022428746019582]
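We collect a benchmark dataset, Situation-Based Explanation. This dataset contains 100 explanandums. For each situation, that is, each "explanandum paired with an audience", we include a human-written explanation. We examine three categories of prompting methods: rule-based prompting, meta-prompting, and in-context learning prompting.
Paper reference translation (metadata) (2024-06-07T15:56:32Z)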
Verifying Relational Explanations: A Probabilistic Approach [2.113770213797994]
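We develop an approach for assessing the uncertainty of explanations generated by GNNExplainer. We learn a factor graph model that quantifies the uncertainty in an explanation. Results on several datasets show that the approach is effective for verifying explanations from GNNExplainer.
Paper reference translation (metadata) (2024-01-05T08:14:51Z)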
Towards More Faithful Natural Language Explanation Using Multi-Level Contrastive Learning in VQA [7.141288053123662]
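Natural language explanation in visual question answering (VQA-NLE) aims to explain a model's decision-making process by generating natural language sentences, in order to increase users' trust in black-box systems. Existing post-hoc explanations are not always aligned with human logical inference and suffer from three issues: (1) deductive unsatisfiability, where the generated explanation does not logically lead to the answer; (2) factual inconsistency, where the model falsifies its counterfactual explanation for an answer without considering the facts in the image; and (3) semantic perturbation insensitivity, where the model cannot recognize semantic changes caused by small perturbations.
Paper reference translation (metadata) (2023-12-21T05:51:55Z)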
ExaRanker: Explanation-Augmented Neural Ranker [67.4894325619275]
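This work shows that neural rankers benefit from explanations. We use LLMs such as GPT-3.5 to augment retrieval datasets with explanations. Our model, called ExaRanker, when fine-tuned on a few thousand examples with synthetic explanations, performs on par with models fine-tuned on three times as many examples without explanations.
Paper reference translation (metadata) (2023-01-25T11:03:04Z)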
The Unreliability of Explanations in Few-Shot In-Context Learning [50.77996380021221]
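We focus on two NLP tasks that involve reasoning over text, namely question answering and natural language inference. Explanations that are logically consistent with the input usually indicate more accurate predictions. We propose a framework for calibrating model predictions based on the reliability of the explanations.
Paper reference translation (metadata) (2022-05-06T17:57:58Z)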
Reframing Human-AI Collaboration for Generating Free-Text Explanations [46.29832336779188]
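We consider the task of generating free-text explanations using a small number of human-written examples. Explanations generated by GPT-3 are preferred over crowdsourced explanations. We create a pipeline that combines GPT-3 with a supervised filter that incorporates humans in the loop via binary acceptability judgments.
Paper reference translation (metadata) (2021-12-16T07:31:37Z)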
Prompting Contrastive Explanations for Commonsense Reasoning Tasks [74.7346558082693]
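Large pretrained language models (PLMs) can achieve near-human performance on commonsense reasoning tasks. We show how to use these same models to generate human-interpretable evidence.
Paper reference translation (metadata) (2021-06-12T17:06:13Z)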
Parameterized Explainer for Graph Neural Network [49.79917262156429]
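We propose PGExplainer, a parameterized explainer for graph neural networks (GNNs). Compared with existing work, PGExplainer has better generalization ability and can be readily used in an inductive setting. Experiments on both synthetic and real-life datasets show highly competitive performance, with up to a 24.7% relative improvement in AUC on explaining graph classification.
Paper reference translation (metadata) (2020-11-09T17:15:03Z)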

The related papers list is generated automatically from the titles and abstracts of papers on this site.

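This page shows information about the specified paper.
The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any results arising from its use.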