HeteroQA: Learning towards Question-and-Answering through Multiple
Information Sources via Heterogeneous Graph Modeling
- URL: http://arxiv.org/abs/2112.13597v1
- Date: Mon, 27 Dec 2021 10:16:43 GMT
- Title: HeteroQA: Learning towards Question-and-Answering through Multiple
Information Sources via Heterogeneous Graph Modeling
- Authors: Shen Gao, Yuchi Zhang, Yongliang Wang, Yang Dong, Xiuying Chen,
Dongyan Zhao and Rui Yan
- Abstract summary: Community Question Answering (CQA) is a well-defined task that can be used in many scenarios, such as E-Commerce and online user communities for special interests.
Most CQA methods incorporate only articles or Wikipedia to extract knowledge and answer the user's question.
We propose a question-aware heterogeneous graph transformer to incorporate the multiple information sources (MIS) in the user community to automatically generate the answer.
- Score: 50.39787601462344
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Community Question Answering (CQA) is a well-defined task that can be used in
many scenarios, such as E-Commerce and online user communities for special
interests.
In these communities, users can post articles, leave comments, raise questions,
and answer them.
These data form heterogeneous information sources, each of which has its own
special structure and context (comments attached to an article, or a related
question with its answers).
Most CQA methods incorporate only articles or Wikipedia to extract
knowledge and answer the user's question.
However, the various types of information sources in the community are not fully
explored by these CQA methods, and these multiple information sources (MIS) can
provide more knowledge related to users' questions.
Thus, we propose a question-aware heterogeneous graph transformer to
incorporate the MIS in the user community to automatically generate the answer.
To evaluate our proposed method, we conduct experiments on two datasets:
$\text{MSM}^{\text{plus}}$, a modified version of the benchmark dataset
MS-MARCO, and AntQA, which is the first large-scale CQA dataset with four
types of MIS.
Extensive experiments on both datasets show that our model outperforms all
baselines on all metrics.
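The heterogeneous structure described in the abstract (comments attached to an article, answers attached to a related question) can be pictured with a small sketch. This is not the authors' model: the class names, node and edge type labels, and sample texts below are hypothetical, and the token-overlap score is only a crude stand-in for the learned question-aware attention of a graph transformer.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    node_id: str
    node_type: str  # hypothetical labels: "article", "comment", "question", "answer"
    text: str


class HeteroGraph:
    """Typed nodes plus typed, directed edges between them."""

    def __init__(self):
        self.nodes = {}
        self.edges = defaultdict(list)  # src_id -> [(edge_type, dst_id), ...]

    def add_node(self, node_id, node_type, text):
        self.nodes[node_id] = Node(node_id, node_type, text)

    def add_edge(self, src_id, edge_type, dst_id):
        if src_id not in self.nodes or dst_id not in self.nodes:
            raise KeyError("both endpoints must be added before the edge")
        self.edges[src_id].append((edge_type, dst_id))

    def neighbors(self, node_id, edge_type=None):
        """Return target nodes of outgoing edges, optionally filtered by type."""
        return [
            self.nodes[dst_id]
            for etype, dst_id in self.edges.get(node_id, [])
            if edge_type is None or etype == edge_type
        ]


def overlap_score(question, text):
    """Fraction of question tokens found in the text (assumed toy relevance)."""
    q_tokens = set(question.lower().split())
    t_tokens = set(text.lower().split())
    return len(q_tokens & t_tokens) / len(q_tokens) if q_tokens else 0.0


def rank_nodes(graph, question):
    """Order every node by its toy relevance to the user's question."""
    return sorted(
        graph.nodes.values(),
        key=lambda node: overlap_score(question, node.text),
        reverse=True,
    )


# Made-up community data covering four kinds of information source.
graph = HeteroGraph()
graph.add_node("a1", "article", "How to reset a payment password in the app")
graph.add_node("c1", "comment", "The reset link expires after ten minutes")
graph.add_node("q1", "question", "Where do I change my login email")
graph.add_node("ans1", "answer", "Open settings and choose account email")
graph.add_edge("c1", "attached_to", "a1")  # comment attached to an article
graph.add_edge("ans1", "answers", "q1")    # answer attached to a related question

user_question = "how do i reset my payment password"
ranked = rank_nodes(graph, user_question)
for node in ranked:
    print(node.node_type, node.node_id, round(overlap_score(user_question, node.text), 2))
```

Keeping the edge type on every edge is what makes the graph heterogeneous: a downstream model can treat "attached_to" and "answers" relations differently instead of collapsing all sources into one flat text pool.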
Related papers
- TANQ: An open domain dataset of table answered questions [15.323690523538572]
TANQ is the first open domain question answering dataset where the answers require building tables from information across multiple sources.
We release the full source attribution for every cell in the resulting table and benchmark state-of-the-art language models in open, oracle, and closed book setups.
Our best-performing baseline, GPT4, reaches an overall F1 score of 29.1, lagging behind human performance by 19.7 points.
arXiv Detail & Related papers (2024-05-13T14:07:20Z)
- NewsQs: Multi-Source Question Generation for the Inquiring Mind [59.79288644158271]
We present NewsQs, a dataset that provides question-answer pairs for multiple news documents.
To create NewsQs, we augment a traditional multi-document summarization dataset with questions automatically generated by a T5-Large model fine-tuned on FAQ-style news articles.
arXiv Detail & Related papers (2024-02-28T16:59:35Z)
- S2M: Converting Single-Turn to Multi-Turn Datasets for Conversational Question Answering [16.930522435912717]
We propose a novel method to convert single-turn datasets to multi-turn datasets.
S2M ranked first on the QuAC leaderboard at the time of submission.
arXiv Detail & Related papers (2023-12-27T10:41:18Z)
- UNK-VQA: A Dataset and a Probe into the Abstention Ability of Multi-modal Large Models [55.22048505787125]
This paper contributes a comprehensive dataset, called UNK-VQA.
We first augment the existing data via deliberate perturbations on either the image or question.
We then extensively evaluate the zero- and few-shot performance of several emerging multi-modal large models.
arXiv Detail & Related papers (2023-10-17T02:38:09Z)
- MQAG: Multiple-choice Question Answering and Generation for Assessing Information Consistency in Summarization [55.60306377044225]
State-of-the-art summarization systems can generate highly fluent summaries.
These summaries, however, may contain factual inconsistencies and/or information not present in the source.
We introduce an alternative scheme based on standard information-theoretic measures in which the information present in the source and summary is directly compared.
arXiv Detail & Related papers (2023-01-28T23:08:25Z)
- Relation-Aware Language-Graph Transformer for Question Answering [21.244992938222246]
We propose Question Answering Transformer (QAT), which is designed to jointly reason over language and graphs with respect to entity relations.
Specifically, QAT constructs Meta-Path tokens, which learn relation-centric embeddings based on diverse structural and semantic relations.
We validate the effectiveness of QAT on commonsense question answering datasets like CommonsenseQA and OpenBookQA, and on a medical question answering dataset, MedQA-USMLE.
arXiv Detail & Related papers (2022-12-02T05:10:10Z)
- Summarizing Community-based Question-Answer Pairs [5.680726650578754]
We propose the novel CQA summarization task that aims to create a concise summary from CQA pairs.
Our data and code are publicly available.
arXiv Detail & Related papers (2022-11-17T21:09:41Z)
- Question Answering Survey: Directions, Challenges, Datasets, Evaluation Matrices [0.0]
The research directions of the QA field are analyzed based on the type of question, answer type, source of evidence-answer, and modeling approach.
This analysis is followed by open challenges of the field, such as automatic question generation, similarity detection, and low resource availability for a language.
arXiv Detail & Related papers (2021-12-07T08:53:40Z)
- SYGMA: System for Generalizable Modular Question Answering Over Knowledge Bases [57.89642289610301]
We present SYGMA, a modular approach facilitating generalizability across multiple knowledge bases and multiple reasoning types.
We demonstrate the effectiveness of our system by evaluating on datasets belonging to two distinct knowledge bases, DBpedia and Wikidata.
arXiv Detail & Related papers (2021-09-28T01:57:56Z)
- Generating Diverse and Consistent QA pairs from Contexts with Information-Maximizing Hierarchical Conditional VAEs [62.71505254770827]
We propose a hierarchical conditional variational autoencoder (HCVAE) for generating QA pairs given unstructured texts as contexts.
Our model obtains impressive performance gains over all baselines on both tasks, using only a fraction of data for training.
arXiv Detail & Related papers (2020-05-28T08:26:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.