Fugu-MT Paper Translation (Abstract): CREAK: A Dataset for Commonsense Reasoning over Entity Knowledge

Paper abstract: CREAK: A Dataset for Commonsense Reasoning over Entity Knowledge

arxiv url: http://arxiv.org/abs/2109.01653v1
Date: Fri, 3 Sep 2021 17:56:40 GMT
Status: Translation complete
Last updated in system: 2021-09-06 14:07:13.515863
Title: CREAK: A Dataset for Commonsense Reasoning over Entity Knowledge
Title (reference translation): CREAK: A dataset for commonsense reasoning about entity knowledge
Authors: Yasumasa Onoe, Michael J.Q. Zhang, Eunsol Choi, Greg Durrett
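Abstract summary: We introduce CREAK, a testbed for commonsense reasoning about entity knowledge. Our dataset consists of 13k human-authored English claims about entities that are either true or false. Crowdworkers can easily come up with these statements, and human performance on the dataset is high.
Reference score (independently computed attention measure): 32.61883349110328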
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Most benchmark datasets targeting commonsense reasoning focus on everyday scenarios: physical knowledge like knowing that you could fill a cup under a waterfall [Talmor et al., 2019], social knowledge like bumping into someone is awkward [Sap et al., 2019], and other generic situations. However, there is a rich space of commonsense inferences anchored to knowledge about specific entities: for example, deciding the truthfulness of a claim "Harry Potter can teach classes on how to fly on a broomstick." Can models learn to combine entity knowledge with commonsense reasoning in this fashion? We introduce CREAK, a testbed for commonsense reasoning about entity knowledge, bridging fact-checking about entities (Harry Potter is a wizard and is skilled at riding a broomstick) with commonsense inferences (if you're good at a skill you can teach others how to do it). Our dataset consists of 13k human-authored English claims about entities that are either true or false, in addition to a small contrast set. Crowdworkers can easily come up with these statements and human performance on the dataset is high (high 90s); we argue that models should be able to blend entity knowledge and commonsense reasoning to do well here. In our experiments, we focus on the closed-book setting and observe that a baseline model finetuned on an existing fact verification benchmark struggles on CREAK. Training a model on CREAK improves accuracy by a substantial margin, but still falls short of human performance. Our benchmark provides a unique probe into natural language understanding models, testing both their ability to retrieve facts (e.g., who teaches at the University of Chicago?) and unstated commonsense knowledge (e.g., butlers do not yell at guests).
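Abstract (reference translation): Most benchmark datasets for commonsense reasoning focus on everyday scenarios: physical knowledge, such as knowing that you could fill a cup under a waterfall [Talmor et al., 2019]; social knowledge, such as knowing that bumping into someone is awkward [Sap et al., 2019]; and other generic situations. However, there is a rich space of commonsense inferences anchored to knowledge about specific entities, for example deciding the truthfulness of the claim "Harry Potter can teach classes on how to fly on a broomstick." Can models learn to combine entity knowledge with commonsense reasoning in this way? We introduce CREAK, a testbed for commonsense reasoning about entity knowledge that bridges fact-checking about entities (Harry Potter is a wizard and is skilled at riding a broomstick) with commonsense inferences (if you are good at a skill, you can teach others how to do it). Our dataset consists of 13k human-authored English claims about entities that are either true or false, in addition to a small contrast set. Crowdworkers can easily come up with these statements, and human performance on the dataset is high (high 90s). In our experiments, we focus on the closed-book setting and observe that a baseline model finetuned on an existing fact verification benchmark struggles on CREAK. Training a model on CREAK improves accuracy by a substantial margin but still falls short of human performance. Our benchmark provides a unique probe into natural language understanding models, testing both their ability to retrieve facts (e.g., who teaches at the University of Chicago?) and their unstated commonsense knowledge (e.g., butlers do not yell at guests).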

Related papers