Abstract
Recent advances in question answering and reading comprehension have produced models that surpass human performance when the answer is contained in a single, continuous passage of text and only single-hop reasoning is required. In real-world scenarios, however, many complex queries require multi-hop reasoning. The key to the question answering task is the semantic feature interaction between documents and questions, which is widely handled by Bi-directional Attention Flow (Bi-DAF); however, Bi-DAF generally captures only the surface semantics of words in complex questions and fails to capture the implied semantic features of intermediate answers. As a result, Bi-DAF partially ignores contexts related to the question and cannot extract the most important parts of multiple documents. In this paper we propose a new model architecture for multi-hop question answering that applies two completion strategies: (1) a Coarse-Grain complex question Decomposition (CGDe) strategy is introduced to decompose a complex question into simple ones without any additional annotations, and (2) a Fine-Grained Interaction (FGIn) strategy is introduced to better represent each word in the document and to extract more comprehensive and accurate sentences related to the inference path. The two strategies are combined and tested on the SQuAD and HotpotQA datasets, and the experimental results show that our method outperforms state-of-the-art baselines.
1 School of Electronic and Information, Beijing Jiaotong University,
Beijing 100044, China
(18111006, liuyun)@bjtu.edu.cn
2 Key Laboratory of Communication and Information Systems, Beijing Municipal Commission of Education,
Beijing 100044, China
1 Introduction
One of the long-standing goals of natural language processing (NLP) is to build systems capable of reasoning about the information present in text.
Several different QA datasets have been proposed, such as the Stanford Question Answering Dataset (SQuAD) [8,9], NarrativeQA [10] and CoQA[11], and this kind of reasoning is termed single-hop reasoning, since it requires reasoning over a single piece of evidence.
Recent advances regarding QA and MRC have surpassed human performance on some single-hop datasets, but those datasets are still far from real-world scenarios.
A more challenging and real-world application task, called multi-hop reasoning [12], requires combining evidence from multiple sources, which means that evidence can be spread across multiple paragraphs.
In the process of reasoning, a subset of these paragraphs may need to be read first to extract information that is useful for understanding other paragraphs, which might otherwise appear not completely relevant to the question.
Q: The rapper whose debut album was titled "Thug Misses" has sold over how many records worldwide?
Q1: Who is the rapper whose debut album was titled "Thug Misses"?
Q2: How many records has that rapper sold worldwide?
P1: "Thug Misses is the debut album by American rapper Khia. The album was originally released in the United States on October 30, 2001…"
P2: "Khia Shamone Finch (born Khia Shamone Chambers, November 8, 1970), … To date Khia has collectively sold over 2 million records worldwide."
Table 1: An example of a multi-hop question from HotpotQA. The first cell shows the given complex question; at the bottom of the cell are the two simple questions into which it is decomposed.
The second cell contains the supporting sentences (boldface part) needed to answer the question (supporting facts); the highlighted part is the final answer.
As shown in Table 1, the model with strong interpretability has the ability to find supporting facts (the boldface part in P1 and P2) of the answer while the answer itself is identified.
In a sense, the supporting facts predicted task is also a demonstration of the reasoning process.
Multi-hop QA faces two challenges.
The first is the difficulty of reasoning due to the complexity of the query.
To address this challenge, several embedding-based models that decompose or generate queries have been proposed (Min et al., 2018 [15]; Qi et al., 2019 [16]); it is easier to find answers by breaking complex questions down into simple ones. For example, the question in Table 1 can be decomposed into the two subquestions "Who is the rapper whose debut album was titled 'Thug Misses'?" and "How many records has that rapper sold worldwide?". However, most existing work decomposes questions using a combination of rule-based algorithms, hand-crafted heuristics, and learning from supervised decompositions, each of which requires significant human effort.
The second challenge is the interpretability of the model.
Jiang et al. [17] pointed out that models can directly locate the answer by word-matching the question against a sentence in the context, because many examples contain reasoning shortcuts.
Therefore, finding all the supporting facts (the inference path) is equally important for multi-hop inference tasks.
To solve these two problems, the decomposition of complex queries and fine-grained feature interactions between documents and query are considered important for models based on semantic features.
Inspired by the existing model proposed by Min et al [15], we propose two novel completion strategies called the Coarse-Grain Decomposition (CGDe) strategy and Fine-Grained Interaction (FGIn) strategy.
The CGDe is used to achieve better predictive capacity and explainability for question decomposition without any additional annotations, and the FGIn is used to better represent each word in the document which helps the model extract more comprehensive and accurate sentences needed to answer the question.
Different from previous works, we aim to use a lightweight model rather than off-the-shelf grammatical tools that perform processing such as named entity recognition to construct graph networks.
Because any model that removes documents unrelated to the query will certainly improve performance, we do not commit to filtering irrelevant documents in advance; instead, we seek to control the amount of passage information in the hidden representations directly.
To summarize, the key contributions are three-fold: (1) The coarse-grained complex question decomposition strategy decomposes the complex queries into simple queries without any additional annotations.
(2) The fine-grained interaction strategy is used to extract more comprehensive and accurate sentences related to the inference path. (3) Our model is validated on multi-hop QA and single-hop QA datasets, and the experimental results show that the model preserves or even surpasses the original system in the objective evaluations, in addition to enhancing the interpretability of the reasoning process.
2 Related Work
Single-hop Question Answering. Most MRC datasets require single-hop reasoning only, which means that the evidence necessary to answer the question is concentrated in a single sentence or clustered tightly in a single paragraph.
SQuAD [8] contains questions that are relatively simple because they usually require no more than one sentence in a single paragraph to answer.
SQuAD 2.0[9] introduces questions that are designed to be unanswerable.
Bi-DAF (Seo et al., 2016) [18] and FastQA (Weissenborn et al., 2017) [19] are popular for single-hop QA, and the Query2Context and Context2Query modules of the Bi-DAF model are widely used as core components in other QA models.
However, these models suffer dramatic accuracy declines in multi-hop QA tasks.
Multi-hop Question Answering. In general, two research directions have been explored to solve the multi-hop and multi-document QA task.
Zhong et al. (2019) [20] proposed a model combining coarse-grained reading and fine-grained reading.
The Query Focused Extractor model proposed by Nishida et al. (2019) [21] regards evidence extraction as a query-focused summarization task and reformulates the query in each hop.
For complex questions, from the perspective of imitating human thinking, decomposing complex questions into simple subquestions is an effective method. Jiang and Bansal [22] proposed a model for multi-hop QA in which four atomic neural modules are designed, namely Find, Relocate, Compare, and NoOp; these modules are dynamically assembled to make multi-hop reasoning and supporting fact selection more interpretable.
However, their system approaches question decomposition by having a decomposer model trained via human labels.
A subset of approaches has introduced end-to-end frameworks explicitly designed to emulate the step-by-step reasoning process involved in multi-hop QA and MRC.
The Kundu et al [23] model constructs paths connecting questions and candidate answers and subsequently scores them through a neural architecture.
Jiang et al. [24] also constructed a proposer that proposes an answer from every root-to-leaf path in the reasoning tree, and an Evidence Assembler that extracts a key sentence containing the proposed answer from every path and combines the sentences to predict the final answer.
The other direction is based on graph neural networks (GNNs) [25].
GNNs have been shown to be successful on many NLP tasks, and recent papers have also examined complex QA using graph neural networks, including graph attention networks, graph recurrent networks, graph convolutional networks and their variants [26,27,28].
Cao et al [29] proposed a bi-directional attention mechanism that was combined with an entity graph convolutional network to obtain the relation-aware representation of nodes for entity graphs.
Qiu et al [30] used a recurrent decoder that guides a dynamic exploration of Wikipedia links among passages to build an “evidence trail” leading to passage with the answer span.
The multilevel graph network can represent the information in the text in more detail, so the hierarchical graph network proposed by Fang et al , 2019[31] leverages a hierarchical graph representation of the background knowledge (i.e., question, paragraphs, sentences, and entities).
Tu et al [32] constructed a graph connecting sentences that are part of the same document, share noun-phrases and have named entities or noun phrases in common with the question, and then applied a GNN to the graph to rank the top entity as the answer.
However, these approaches often fail to adequately capture the inherent structure of documents and discard masses of valuable structural information when transforming documents into graphs.
Documents unrelated to the complex query may affect the accuracy of the model.
In the “select, answer, and explain” (SAE) model proposed by Tu et al [33], BERT [34] acts as the encoder in the selection module.
Then a sentence extractor is applied to the output of BERT to obtain the sequential output of each sentence with precalculated sentence start and end indices, to filter out answer-unrelated documents and thus reduce the amount of distraction information.
The selected answer-related documents are then input to a model, which jointly predicts the answer and supporting sentences.
Concurrently with the SAE model, Bhargav et al. [35] used a two-stage BERT-based architecture to first select the supporting sentences and then used the filtered supporting sentences to predict the answer.
Input: Context C (text), Query Q (text)
Output: Answer Type AT (label), Answer String AS (text), Supporting facts (multiple texts)
Table 2: Symbol definitions.
As shown in Table 2, the context C and the query Q have T words and J words respectively, where C is regarded as one connected text.
The answer type AT is selected from the answer candidates, such as ‘yes/no/span’.
The answer string AS is a short span in context, which is determined by predicting the positions of the start token and the end token when there are not enough answer candidates to answer Q.
The supporting facts consist of one or more sentences in C and are required to answer Q.
4 Model
4.1 Overview
Our intuition is drawn from the human reasoning process for QA, and we propose a Coarse-grain Decomposition and Fine-grain Interaction (CGDe-FGIn) model.
The model mainly consists of a context and question embedding layer, a contextual embedding layer, a coarse-grained decomposition layer, a fine-grained interaction layer, a modeling layer, and an output layer.
Figure 1: Overview of the CGDe-FGIn architecture.
4.2 Context and Question Embedding Layer
We use a pre-trained word embedding model and a character embedding model to lay the foundation for the CGDe-FGIn model.
Following Yang et al 2018[13] we use pre-trained word vectors in the form of GloVe (Pennington et al , 2014[36]) to obtain the fixed word embedding of each word, and we obtain the character level embedding of each word using convolutional neural networks (CNNs).
The concatenation of the character and word embedding vectors is passed to a two-layer highway network (Srivastava et al , 2015[37]).
The outputs of the highway network are two sequences of d-dimensional vectors, or more conveniently, two matrices: X ∈ ℝ^{T×d} for the context and Q ∈ ℝ^{J×d} for the query, where T and J are the numbers of words in the multiple documents and in the query respectively, and d is the dimension after the fusion of the word embedding and the character-level embedding.
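To make this layer concrete, the following is a minimal PyTorch sketch of the character CNN, GloVe word embedding, and two-layer highway network described above; the class names and hyper-parameters (char_emb_dim, num_filters, kernel_size) are illustrative assumptions, not values taken from the paper.

```python
import torch
import torch.nn as nn


class Highway(nn.Module):
    """Two-layer highway network (Srivastava et al., 2015)."""
    def __init__(self, dim, num_layers=2):
        super().__init__()
        self.transforms = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_layers))
        self.gates = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_layers))

    def forward(self, x):
        for transform, gate in zip(self.transforms, self.gates):
            g = torch.sigmoid(gate(x))                      # carry/transform gate
            x = g * torch.relu(transform(x)) + (1.0 - g) * x
        return x


class WordCharEmbedding(nn.Module):
    """Fixed GloVe word embedding + character-level CNN, fused and passed through a highway net."""
    def __init__(self, glove_vectors, char_vocab_size, char_emb_dim=8, num_filters=100, kernel_size=5):
        super().__init__()
        self.word_emb = nn.Embedding.from_pretrained(glove_vectors, freeze=True)
        self.char_emb = nn.Embedding(char_vocab_size, char_emb_dim, padding_idx=0)
        self.char_cnn = nn.Conv1d(char_emb_dim, num_filters, kernel_size, padding=kernel_size // 2)
        self.highway = Highway(glove_vectors.size(1) + num_filters)

    def forward(self, word_ids, char_ids):
        # word_ids: (batch, seq_len); char_ids: (batch, seq_len, word_len)
        w = self.word_emb(word_ids)                                   # (batch, seq_len, glove_dim)
        b, s, l = char_ids.size()
        c = self.char_emb(char_ids.view(b * s, l)).transpose(1, 2)    # (b*s, char_emb_dim, word_len)
        c = torch.relu(self.char_cnn(c)).max(dim=2).values            # max-pool over characters
        c = c.view(b, s, -1)                                          # (batch, seq_len, num_filters)
        return self.highway(torch.cat([w, c], dim=-1))                # (batch, seq_len, d)
```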
4.3 Contextual Embedding Layer
We use bi-directional recurrent neural networks with gated recurrent units (GRUs) (Cho et al., 2014 [38]) to encode the contextual information present in the query and in the multiple context paragraphs separately.
The outputs of the query and document encoders are U ∈ ℝ^{J×2d} and H ∈ ℝ^{T×2d}, respectively.
Here, 2d denotes the output dimension of the encoders.
Note that each column vector of H and U has dimension 2d because of the concatenation of the outputs of the forward and backward GRUs, each with d-dimensional output.
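A minimal sketch of this encoding step, assuming the context and the query are encoded by separate single-layer bi-directional GRUs whose hidden size d matches the embedding dimension:

```python
import torch
import torch.nn as nn


class ContextualEncoder(nn.Module):
    """Bi-directional GRU encoder; maps (batch, seq_len, d) to (batch, seq_len, 2d)."""
    def __init__(self, d):
        super().__init__()
        self.gru = nn.GRU(input_size=d, hidden_size=d, batch_first=True, bidirectional=True)

    def forward(self, x):
        out, _ = self.gru(x)   # forward and backward hidden states concatenated on the last axis
        return out


# Applied separately to the embedded context X and query Q:
#   H = context_encoder(X)   -> (batch, T, 2d)
#   U = query_encoder(Q)     -> (batch, J, 2d)
```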
4.4 Coarse-grained Decomposition Layer
The coarse-grained decomposition layer is responsible for decomposing complex questions and generating a new high-dimensional question representation.
Similarity matrix computation. First, a semantic similarity matrix is calculated for the question (U) and the multiple documents (H), as described by Yang et al. [13]. The semantic similarity matrix is S ∈ ℝ^{T×J}, where S_tj indicates the similarity between the t-th context word and the j-th query word. The similarity matrix is computed by:

h = linear(H), h ∈ ℝ^{T×1}   (1)
u = permute(linear(U)), u ∈ ℝ^{1×J}   (2)
α(H, U) = H U^⊤, α(H, U) ∈ ℝ^{T×J}   (3)
S_tj = [h + u + α(H, U)]_tj, S ∈ ℝ^{T×J}   (4)

where linear indicates a linear layer, permute represents a dimension transposition operation, ⊤ indicates the matrix transpose, and h and u are broadcast over columns and rows respectively when added.
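The following sketch implements equations (1)-(4) in PyTorch; the use of bias terms in the linear layers and the realization of α(H, U) as the batched product H U^⊤ are our assumptions about details the equations leave implicit.

```python
import torch
import torch.nn as nn


class SimilarityMatrix(nn.Module):
    """Semantic similarity matrix S (equations (1)-(4)) between context H and query U."""
    def __init__(self, d):
        super().__init__()
        self.linear_h = nn.Linear(2 * d, 1)   # h = linear(H), eq. (1)
        self.linear_u = nn.Linear(2 * d, 1)   # u = linear(U), eq. (2)

    def forward(self, H, U):
        # H: (batch, T, 2d) context encoding; U: (batch, J, 2d) query encoding
        h = self.linear_h(H)                        # (batch, T, 1)
        u = self.linear_u(U).transpose(1, 2)        # (batch, 1, J) -- the "permute" in eq. (2)
        alpha = torch.bmm(H, U.transpose(1, 2))     # (batch, T, J) -- eq. (3)
        return h + u + alpha                        # broadcast sum, S in R^{T x J} -- eq. (4)
```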
Inspired by human hop-by-hop reasoning behavior, the purpose of decomposing a complex question is to shift the high-dimensional vector representations of its entity nouns or pronouns toward the intermediate answer of the question.
For example, the relatively complex question "The rapper whose debut album was titled 'Thug Misses' has sold over how many records worldwide?" can be decomposed into two subquestions, "Who is the rapper whose debut album was titled 'Thug Misses'?" and "How many records has that rapper sold worldwide?".
Therefore, the answer to the first subquestion is crucial to answering the second question.
In answering complex questions, the high-dimensional vectors of nouns such as "The rapper" are expected to become more similar to the intermediate answers required to answer the complex questions, such as "American rapper Khia."
This is, in effect, a disguised decomposition of the complex query.
To understand this point better, we transpose S to obtain S̃ ∈ ℝ^{J×T}. As shown in Fig. 2, the attention over the context words and the decomposed query representation are computed by:

a_j: = softmax(S̃_j:), a_j: ∈ ℝ^T   (5)
Q̃ = a H, Q̃ ∈ ℝ^{J×2d}   (6)

where a ∈ ℝ^{J×T} stacks the rows a_j:. To preserve the information of the original query, Q̃ is fused with U:

Q̂ = β(U, Q̃), Q̂ ∈ ℝ^{J×2d}   (7)
β(U, Q̃) = W_S [U; Q̃; U ∘ Q̃]   (8)

where W_S ∈ ℝ^{6d} is a trainable weight vector, ∘ represents element-wise multiplication, and [;] represents vector concatenation across rows. We obtain Q̂, the integration of the original query and the decomposed query, and then repeat the similarity computation between the context and this new query.
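A sketch of the coarse-grained decomposition (equations (5)-(8)) follows. The paper states W_S ∈ ℝ^{6d}; here it is realized as a 6d→2d linear map so that Q̂ has the stated shape ℝ^{J×2d}, and that output size is our assumption.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CoarseGrainedDecomposition(nn.Module):
    """Coarse-grained decomposition (eqs. (5)-(8)): attend from each query word to the
    context and fuse the attended context back into the original query representation."""
    def __init__(self, d):
        super().__init__()
        self.w_s = nn.Linear(6 * d, 2 * d, bias=False)   # fusion weight; 2d output is an assumption

    def forward(self, S, H, U):
        # S: (batch, T, J) similarity matrix; H: (batch, T, 2d); U: (batch, J, 2d)
        a = F.softmax(S.transpose(1, 2), dim=-1)               # (batch, J, T) -- eq. (5)
        Q_tilde = torch.bmm(a, H)                              # (batch, J, 2d) -- eq. (6)
        fused = torch.cat([U, Q_tilde, U * Q_tilde], dim=-1)   # [U; Q~; U o Q~] -- eq. (8)
        return self.w_s(fused)                                 # (batch, J, 2d) -- eq. (7), new query Q^
```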
In the vanilla Query2Context module of Bi-DAF, the attention over the context words is obtained from a softmax over the column-wise maximum of S, and h̃ is the resulting attention-weighted sum of the context vectors. Here, h̃ is tiled T times across the column, thus giving H̃ ∈ ℝ^{T×2d}, as shown in Fig. 4.
The vanilla Query2Context module has two main deficiencies.
First, the maximum function (max col) is performed across the column, and words that are consistent with the context in the question have a higher weight, such as the words "rapper" and "whose" in Fig 5.
As a result, the words constituting the intermediate answers needed to answer complex questions are easily ignored; therefore, the original Query2Context module does not perform well on the supporting facts prediction task.
Second, since the size of the vector output of the vanilla Query2Context module is (batch size, 1, 2d), it needs to be repeated T times to obtain the vector of the same size as the input document, to meet the requirements of the vector size of subsequent model input.
However, repeating the vector T times also gives every word in the contextual embedding of the context the same high-dimensional feature vector.
The output layer of the model classifies the feature vector of each word in the context to predict the start and end positions of the answer; such an output from the vanilla Query2Context module is clearly unfavorable for the subsequent layers.
The model instead obtains J vector matrices of size (T, 2d), where J is the number of words in the question, and each matrix indicates the correlation between all the words in the context and the corresponding query word.
The similarity matrix S̄ between the contextual embeddings of the context (H) and the new query (Q̂) is computed by:

q̄ = permute(linear(Q̂)), q̄ ∈ ℝ^{1×J}   (10)
S̄_tj = [h + q̄ + α(H, Q̂)]_tj, S̄ ∈ ℝ^{T×J}   (11)

The attention weight ā is computed by:

ā_:j = softmax(S̄_:j), ā_:j ∈ ℝ^T   (12)

The fine-grained Query2Context representation Ũ is computed by:

Ũ = Σ_j ā_:j ∘ H, Ũ ∈ ℝ^{T×2d}   (13)

where each ā_:j is broadcast across the 2d feature columns of H before the element-wise product.
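A sketch of the fine-grained Query2Context interaction (equations (10)-(13)); since ā_:j is broadcast over the 2d feature columns, the sum over j reduces to scaling every context vector by the total attention mass it receives, which is how the last line below implements equation (13).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FineGrainedQuery2Context(nn.Module):
    """Fine-grained Query2Context (eqs. (10)-(13)): every query word contributes its own
    attention over the context instead of a single max-pooled attention vector."""
    def __init__(self, d):
        super().__init__()
        self.linear_h = nn.Linear(2 * d, 1)
        self.linear_q = nn.Linear(2 * d, 1)

    def forward(self, H, Q_hat):
        # H: (batch, T, 2d); Q_hat: (batch, J, 2d) new query from the CGDe layer
        h = self.linear_h(H)                                     # (batch, T, 1)
        q = self.linear_q(Q_hat).transpose(1, 2)                 # (batch, 1, J) -- eq. (10)
        S_bar = h + q + torch.bmm(H, Q_hat.transpose(1, 2))      # (batch, T, J) -- eq. (11)
        a_bar = F.softmax(S_bar, dim=1)                          # softmax over T context words -- eq. (12)
        # eq. (13): sum over query words of the broadcast element-wise product with H
        return a_bar.sum(dim=2, keepdim=True) * H                # (batch, T, 2d)
```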
Figure 4: Vanilla Query2Context.
Figure 5: Heatmap of the semantic similarity matrix.
The Context2Query attention signifies which query words are most relevant to each context word.
Figure 6: Fine-grained interaction
Finally, the contextual embeddings and the feature vectors computed by the fine-grained interaction layer are combined to yield G, the query-aware representation of the context words.
4.6 Modeling Layer
The output G of the fine-grained Query2Context layer is taken as input to the modeling layer, which encodes the query-aware representations of the context words.
Since multiple documents contain thousands of words, the long-distance dependency problem is obvious, so a self-attention module is added to alleviate this problem.
Similar to the baseline model, we use the original Bi-DAF function to implement self-attention, in which the input is changed from (query, context) to (context, context).
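A minimal sketch of this self-attention step, reusing a Bi-DAF-style attention in which both arguments are the context representation; the similarity argument is assumed to be a module like the SimilarityMatrix sketch above (instantiated to match G's feature size), and the output concatenation follows the usual Bi-DAF pattern rather than a formula stated in the paper.

```python
import torch
import torch.nn.functional as F


def self_attention(G, similarity):
    """Bi-DAF attention applied with (context, context): every context word attends to all
    other context words, alleviating long-distance dependencies in long documents."""
    S = similarity(G, G)            # (batch, T, T) context-to-context similarity
    a = F.softmax(S, dim=-1)        # attention distribution over context words
    attended = torch.bmm(a, G)      # (batch, T, dim) aggregated long-range information
    return torch.cat([G, attended, G * attended], dim=-1)
```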
4.7 Prediction Layer
We follow the same structure of prediction layers as Yang et al. (2018) [13].
To solve the degradation problem of the deep neural network, residual connections are made between the output of the fine-grained Query2Context layer and the output of the modeling layer, which together form the input to the prediction layer.
Within the prediction layer, four isomorphic Bi-GRUs are stacked layer by layer, and we adopt a cascade structure to solve the output dependency problem and avoid information loss.
The prediction layer has four output dimensions: 1. supporting sentences, 2. the start position of the answer, 3. the end position of the answer, and 4. the answer type.
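A minimal sketch of this cascaded prediction layer is given below: four stacked bi-directional GRUs feed four heads (supporting sentences, answer start, answer end, answer type). The exact cascading follows Yang et al. [13]; the hidden sizes, concatenation points, and pooling for the answer type are our assumptions.

```python
import torch
import torch.nn as nn


class PredictionLayer(nn.Module):
    """Cascaded output structure: supporting facts -> answer start -> answer end -> answer type."""
    def __init__(self, in_dim, hidden, num_answer_types=3):
        super().__init__()
        self.gru_sp = nn.GRU(in_dim, hidden, batch_first=True, bidirectional=True)
        self.gru_start = nn.GRU(in_dim + 2 * hidden, hidden, batch_first=True, bidirectional=True)
        self.gru_end = nn.GRU(in_dim + 2 * hidden, hidden, batch_first=True, bidirectional=True)
        self.gru_type = nn.GRU(in_dim + 2 * hidden, hidden, batch_first=True, bidirectional=True)
        self.sp_head = nn.Linear(2 * hidden, 1)                    # supporting-fact score
        self.start_head = nn.Linear(2 * hidden, 1)                 # answer start position
        self.end_head = nn.Linear(2 * hidden, 1)                   # answer end position
        self.type_head = nn.Linear(2 * hidden, num_answer_types)   # e.g. yes / no / span

    def forward(self, M):
        # M: (batch, T, in_dim), modeling-layer output with the residual connection added
        o_sp, _ = self.gru_sp(M)
        sp_logits = self.sp_head(o_sp).squeeze(-1)
        o_start, _ = self.gru_start(torch.cat([M, o_sp], dim=-1))
        start_logits = self.start_head(o_start).squeeze(-1)
        o_end, _ = self.gru_end(torch.cat([M, o_start], dim=-1))
        end_logits = self.end_head(o_end).squeeze(-1)
        o_type, _ = self.gru_type(torch.cat([M, o_end], dim=-1))
        type_logits = self.type_head(o_type.max(dim=1).values)     # pooled over tokens (assumption)
        return sp_logits, start_logits, end_logits, type_logits
```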
We evaluate our model on development sets in the distractor setting, following prior work.
For the full wiki setting where all Wikipedia articles are given as input, we consider the bottleneck to be about information retrieval, thus we do not include the full wiki setting in our experiments.
The typical length of a paragraph is approximately 250 tokens and that of a question is about 10 tokens, although there are exceptionally long cases.
The SQuAD dataset is mainly used to verify the validity and universality of the model components we propose, namely coarse-grained decomposition strategy and fine-grained interaction strategy.
According to the observations from our experiments and previous works, the validation score is well correlated with the test score.
5.2 Implementation Details We keep the baseline (Bi-DAF) parameter settings on the two data sets to prove that our model components and model architecture have absolute performance advantages over the baseline.
A dropout rate of 0.2 (Srivastava et al., 2014 [39]) is used for the CNN and LSTM layers and for the linear transformation before the softmax over the answers.
During training, the moving averages of all the model weights are maintained with an exponential decay rate of 0.999.
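A minimal sketch of how such weight averaging can be maintained during training; the class and method names are illustrative and not from the paper.

```python
import torch


class ExponentialMovingAverage:
    """Maintain shadow copies of model weights with exponential decay (e.g., 0.999)."""
    def __init__(self, model, decay=0.999):
        self.decay = decay
        self.shadow = {name: p.detach().clone()
                       for name, p in model.named_parameters() if p.requires_grad}

    @torch.no_grad()
    def update(self, model):
        # shadow <- decay * shadow + (1 - decay) * current weights, after each optimizer step
        for name, p in model.named_parameters():
            if name in self.shadow:
                self.shadow[name].mul_(self.decay).add_(p.detach(), alpha=1.0 - self.decay)

    @torch.no_grad()
    def copy_to(self, model):
        """Load the averaged weights into (a copy of) the model for evaluation."""
        for name, p in model.named_parameters():
            if name in self.shadow:
                p.data.copy_(self.shadow[name])
```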
The training process takes approximately 6 hours on a single 2080 ti GPU.
5.3 Main Results
Model Comparison. We compare our results with those of two types of baseline model.
One is the model with Bi-DAF as the core component.
Questions and documents are not processed by off-the-shelf language tools, but only contextual embedding is performed.
This type of model is dedicated mainly to the feature interaction between questions and documents.
The advantages of these models are fewer model parameters, short training time, and low GPU computing power requirements.
The other is the reasoning model based on a graph neural network.
This type of model usually uses a language model or tool for named entity recognition to construct an entity graph, and then a graph convolutional neural network is used to update the node representation on the entity graph.
Table 3: The performance of our CGDe-FGIn model and of the competing approaches by Yang et al., Ye et al., and Jiang et al. on the HotpotQA dataset (EM and F1).
The performance of multi-hop QA on HotpotQA is evaluated using exact match (EM) and F1 as the two evaluation metrics.
To assess the explainability of the models, the dataset further introduces two sets of metrics involving the supporting facts: the first set evaluates the predicted supporting facts directly.
The second set features joint metrics that combine the evaluation of answer spans and supporting facts.
All metrics are evaluated example-by-example, and then averaged over examples in the evaluation set.
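As a concrete illustration of how these per-example scores can be combined, the sketch below follows the multiplicative joint precision/recall used by the HotpotQA evaluation; answer-string normalization (lowercasing, punctuation and article stripping) is omitted for brevity, and the helper names are ours.

```python
from collections import Counter


def answer_prf(pred, gold):
    """Token-overlap precision / recall / F1 between predicted and gold answer strings."""
    pred_toks, gold_toks = pred.split(), gold.split()
    common = sum((Counter(pred_toks) & Counter(gold_toks)).values())
    if common == 0:
        return 0.0, 0.0, 0.0
    p, r = common / len(pred_toks), common / len(gold_toks)
    return p, r, 2 * p * r / (p + r)


def sp_prf(pred_facts, gold_facts):
    """Set precision / recall / F1 over predicted supporting-fact sentence ids."""
    pred_set, gold_set = set(pred_facts), set(gold_facts)
    tp = len(pred_set & gold_set)
    p = tp / len(pred_set) if pred_set else 0.0
    r = tp / len(gold_set) if gold_set else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1


def joint_scores(ans_pred, ans_gold, sp_pred, sp_gold):
    """Per-example joint EM/F1; the dataset-level score is the mean over all examples."""
    ans_p, ans_r, _ = answer_prf(ans_pred, ans_gold)
    sp_p, sp_r, _ = sp_prf(sp_pred, sp_gold)
    joint_p, joint_r = ans_p * sp_p, ans_r * sp_r
    joint_f1 = 2 * joint_p * joint_r / (joint_p + joint_r) if joint_p + joint_r else 0.0
    joint_em = float(ans_pred == ans_gold) * float(set(sp_pred) == set(sp_gold))
    return joint_em, joint_f1
```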
We compare our approach with several previously published models, and present our results in Table 3.
All experiments are performed multiple times for each of our models, and the table shows the mean and standard deviation.
As shown in the table, all the results of our proposed model are superior to those of the baseline model in the case that the model parameters are not increased substantially.
5.4 Ablation Studies
In this paper, we design two strategies for multi-hop question answering.
To study the contributions of these two strategies to the performance of our model, we conduct an ablation experiment by removing coarse-grained decomposition strategy or fine-grained interaction strategy on the SQuAD1.1 and HotpotQA datasets.
Table 4: Ablation results on the HotpotQA dev set.
Model            EM     F1
Baseline Model   64.56  75.51
FGIn             66.32  76.93
CGDe             65.25  75.96
CGDe / FGQTC     66.44  77.06
Table 5: Ablation results on the SQuAD dev set.
As shown in Tables 4 and 5, removing either the CGDe or the FGIn strategy reduces the effectiveness of the model, which demonstrates that both strategies contribute to our model.
Moreover, using either strategy individually enables our model to achieve better results than the baseline model.
Analysis and Visualization
In this section, we conduct a series of visual analyses with different settings using our approach.
Coarse-grained decomposition. The coarse-grained decomposition module multiplies the similarity matrix between the query and the document by the document representation to obtain a new query representation of size (J, 2d).
After merging with the original query representation, the new query representation should have higher semantic similarity with the corresponding words of the document, for example, the phrase "The rapper" and the word "Khia" in the complex question "The rapper whose debut album was titled 'Thug Misses' has sold over how many records worldwide?".
Q1 Who is the rapper whose debut album was titled 'Thug Misses'?
Support fact one: Thug Misses is the debut album by American rapper Khia.
Q2 How many records has that rapper sold worldwide?
Support fact two: To date Khia has collectively sold over 2 million records worldwide.
Table 6: Subquestions and supporting facts.
As shown by the subquestions and supporting facts in Table 6, we hope that the phrase "The rapper" and the word "Khia" have more similar representations, so that the complex query becomes a simple single-hop query: "The rapper (Khia) whose debut album was titled 'Thug Misses' has sold over how many records worldwide?".
To confirm this idea, we use the trained baseline model and our model to process the validation set and generate the corresponding heat maps of the attention matrix (the darker the color in the figure, the higher the similarity weight).
In the baseline model's heat map, the attention weights between the phrase "The rapper" and the word "Khia" are not high; it is worth noting that the slightly higher correlation that does appear between the two phrases is caused by the similarity of their parts of speech, since "rapper" is a noun and "Khia" is a person's name.
Different from the baseline model, the heat map of our model shows that the semantic similarity of the phrase "The rapper" and the word "Khia" is significantly higher than that of other surrounding words.
This shows that the new question contains the subanswers that appear in the text to a certain extent, so that the multi-hop query is decomposed into a simple single-hop query.
Figure 7: Attention heat map of the baseline model
Figure 8: Attention heat map of our model
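The heat maps above can be produced with a short matplotlib routine like the one below; the function name, figure sizing, and color map are illustrative choices.

```python
import matplotlib.pyplot as plt
import torch


def plot_attention_heatmap(S, context_tokens, query_tokens, path="heatmap.png"):
    """Plot a (T x J) similarity/attention matrix as a heatmap; darker cells indicate
    higher similarity between a context word (row) and a query word (column)."""
    S = S.detach().cpu().numpy() if torch.is_tensor(S) else S
    fig, ax = plt.subplots(figsize=(0.4 * len(query_tokens) + 2, 0.3 * len(context_tokens) + 2))
    ax.imshow(S, cmap="Blues", aspect="auto")
    ax.set_xticks(range(len(query_tokens)))
    ax.set_xticklabels(query_tokens, rotation=90)
    ax.set_yticks(range(len(context_tokens)))
    ax.set_yticklabels(context_tokens)
    fig.tight_layout()
    fig.savefig(path, dpi=200)
    plt.close(fig)
```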
The ablation study shows that the coarse-grained decomposition module improves the answer EM and F1; compared with the fine-grained interaction model, its improvement on the supporting-facts EM and F1 is smaller.
This shows that the model's ability to predict support facts is limited, because the new question generated contains the intermediate answer required for the first subquestion, so the support context that answers the first question may not be predicted as a supporting fact.
Fine-grained interaction. As shown in Table 4, the fine-grained interaction strategy performs well on the supporting facts task, which further proves that the strategy can model more appropriate semantic features, represented by a high-dimensional vector, for individual words in multiple documents.
To make this more intuitive, we visually present the instances in HotpotQA datasets.
According to the previous section, the complex query in Table 1 requires two supporting fact sentences, "Thug Misses is the debut album by American rapper Khia." and "To date Khia has collectively sold over 2 million records worldwide." In Fig. 9, subfigures (a) and (b) show heatmaps of the semantic similarity matrix of the baseline model (Bi-DAF), showing the part of the complex query corresponding to the supporting fact sentences.
Similarly, subfigures (c) and (d) show the same part of our model with the fine-grained interaction strategy.
Compared with the baseline model, the supporting fact sentences in our model have a higher weight in multiple documents.
6 Conclusion and Future Work
In this paper, we propose a multi-hop question answering model that contains a coarse-grained decomposition strategy to divide a complex query into multiple single-hop simple queries and a fine-grained interaction strategy to better represent each word in the documents and to extract the sentences related to the inference path.
In the experiments, we show that our models significantly and consistently outperform the baseline model.
In the future, we think that the following issue would be worth studying: in the fine-grained interaction layer, assigning different weights to the J context representations corresponding to each word in a complex query, instead of simply adding them together, could further improve our model.
References
[1] Chen D, Fisch A, Weston J, et al. Reading Wikipedia to answer open-domain questions[C]//Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. 2017.
[2] Xiong C, Zhong V, Socher R. Dynamic coattention networks for question answering[J]. arXiv preprint arXiv:1611.01604, 2016.
[3] Cui Y, Chen Z, Wei S, et al. Attention-over-attention neural networks for reading comprehension[C]//Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2017: 593-602.
[4] Huang H Y, Zhu C, Shen Y, et al. FusionNet: Fusing via fully-aware attention with application to machine comprehension[J]. arXiv preprint arXiv:1711.07341, 2017.
[5] Huang M, Zhu X, Gao J. Challenges in building intelligent open-domain dialog systems[J]. ACM Transactions on Information Systems (TOIS), 2020, 38(3): 1-32.
[6] Chen H, Liu X, Yin D, et al. A survey on dialogue systems: Recent advances and new frontiers[J].
[7] Liu N, Shen B. ReMemNN: A novel memory neural network for powerful interaction in aspect-based sentiment analysis[J]. Neurocomputing, 2020.
[8] Rajpurkar P, Zhang J, Lopyrev K, et al. SQuAD: 100,000+ questions for machine comprehension of text[C]//Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. 2016: 2383-2392.
[9] Rajpurkar P, Jia R, Liang P. Know what you don't know: Unanswerable questions for SQuAD[C]//Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). 2018: 784-789.
[10] Kočiský T, Schwarz J, Blunsom P, et al. The NarrativeQA reading comprehension challenge[J]. Transactions of the Association for Computational Linguistics, 2018, 6: 317-328.
[11] Reddy S, Chen D, Manning C D. CoQA: A conversational question answering challenge[J]. Transactions of the Association for Computational Linguistics, 2019, 7: 249-266.
[12] Lin X V, Socher R, Xiong C. Multi-hop knowledge graph reasoning with reward shaping[J]. arXiv preprint arXiv:1808.10568, 2018.
[13] Yang Z, Qi P, Zhang S, et al. HotpotQA: A dataset for diverse, explainable multi-hop question answering[C]//Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 2018: 2369-2380.
[14] Welbl J, Stenetorp P, Riedel S. Constructing datasets for multi-hop reading comprehension across documents[J]. Transactions of the Association for Computational Linguistics, 2018, 6: 287-302.
[15] Min S, Zhong V, Zettlemoyer L, et al.
[18] Seo M, Kembhavi A, Farhadi A, et al. Bidirectional attention flow for machine comprehension[C]//Proceedings of the International Conference on Learning Representations. 2017.
[19] Weissenborn D, Wiese G, Seiffe L. FastQA: A simple and efficient neural architecture for question answering[J]. arXiv preprint arXiv:1703.04816, 2017.
[20] Zhong V, Xiong C, Keskar N S, et al. Coarse-grain fine-grain coattention network for multi-evidence question answering[C]//ICLR. 2019.
[21] Nishida K, Nishida K, Nagata M, et al. Answering while summarizing: Multi-task learning for multi-hop QA with evidence extraction[C]//Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019.
[22] Jiang Y, Bansal M. Self-assembling modular networks for interpretable multi-hop reasoning[C]//Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019: 4464-4474.
[23] Kundu S, Khot T, Sabharwal A, et al. Exploiting explicit paths for multi-hop reading comprehension[C]//Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019: 2737-2747.
[24] Jiang Y, Joshi N, Chen Y C, et al. Explore, propose, and assemble: An interpretable model for multi-hop reading comprehension[C]//Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019: 2714-2725.
[25] Xu K, Hu W, Leskovec J, et al. How powerful are graph neural networks?[J]. arXiv preprint arXiv:1810.00826, 2018.
[26] Veličković P, Cucurull G, Casanova A, et al. Graph attention networks[J]. arXiv preprint arXiv:1710.10903, 2017.
[27] Hajiramezanali E, Hasanzadeh A, Narayanan K, et al. Variational graph recurrent neural networks[C]//Advances in Neural Information Processing Systems.
[28] Kipf T N, Welling M. Semi-supervised classification with graph convolutional networks[J]. arXiv preprint arXiv:1609.02907, 2016.
[29] Cao Y, Fang M, Tao D. BAG: Bi-directional attention entity graph convolutional network for multi-hop reasoning question answering[C]//Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 2019: 357-362.
[30] Qiu L, Xiao Y, Qu Y, et al. Dynamically fused graph network for multi-hop reasoning[C]//Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019: 6140-6150.
[31] Fang Y, Sun S, Gan Z, et al. Hierarchical graph network for multi-hop question answering[J]. arXiv preprint arXiv:1911.03631, 2019.
[32] Tu M, Wang G, Huang J, et al. Multi-hop reading comprehension across multiple documents by reasoning over heterogeneous graphs[C]//Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019: 2704-2713.
[33] Tu M, Huang K, Wang G, et al. Select, answer and explain: Interpretable multi-hop reading comprehension over multiple documents[C]//AAAI. 2020: 9073-9080.
[34] Devlin J, Chang M W, Lee K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding[J]. arXiv preprint arXiv:1810.04805, 2018.
[35] Bhargav G P S, Glass M, Garg D, et al. Translucent answer predictions in multi-hop reading comprehension[C]//Proceedings of the AAAI Conference on Artificial Intelligence. 2020, 34(05): 7700-7707.
[36] Pennington J, Socher R, Manning C D. GloVe: Global vectors for word representation[C]//Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP).