Hybrid-SQuAD: Hybrid Scholarly Question Answering Dataset
- URL: http://arxiv.org/abs/2412.02788v2
- Date: Thu, 05 Dec 2024 10:30:56 GMT
- Title: Hybrid-SQuAD: Hybrid Scholarly Question Answering Dataset
- Authors: Tilahun Abedissa Taffa, Debayan Banerjee, Yaregal Assabie, Ricardo Usbeck
- Abstract summary: We introduce Hybrid-SQuAD, a novel large-scale Scholarly Question Answering dataset.
The dataset consists of 10.5K question-answer pairs generated by a large language model.
We propose a RAG-based baseline hybrid QA model, achieving an exact match score of 69.65 on the Hybrid-SQuAD test set.
- Score: 8.867885891794877
- Abstract: Existing Scholarly Question Answering (QA) methods typically target homogeneous data sources, relying solely on either text or Knowledge Graphs (KGs). However, scholarly information often spans heterogeneous sources, necessitating the development of QA systems that integrate information from multiple heterogeneous data sources. To address this challenge, we introduce Hybrid-SQuAD (Hybrid Scholarly Question Answering Dataset), a novel large-scale QA dataset designed to facilitate answering questions incorporating both text and KG facts. The dataset consists of 10.5K question-answer pairs generated by a large language model, leveraging the KGs DBLP and SemOpenAlex alongside corresponding text from Wikipedia. In addition, we propose a RAG-based baseline hybrid QA model, achieving an exact match score of 69.65 on the Hybrid-SQuAD test set.
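The baseline above is described only at a high level, so the following is a minimal, hypothetical sketch of a hybrid RAG pipeline in the same spirit: retrieve evidence from both a knowledge graph and a text corpus, then assemble it into a single prompt for an answer-generating LLM. The toy triples, passages, and function names are illustrative assumptions, not the authors' implementation.

```python
"""A minimal, hypothetical sketch of a hybrid (KG + text) RAG pipeline.
This is NOT the Hybrid-SQuAD baseline; it only illustrates the idea of
retrieving evidence from a knowledge graph and a text corpus, then
conditioning an answer generator on both."""
import re

# Toy stand-ins for DBLP/SemOpenAlex triples and Wikipedia passages.
KG_TRIPLES = [
    ("DBLP", "type", "bibliographic knowledge graph"),
    ("Hybrid-SQuAD", "size", "10.5K QA pairs"),
]
TEXT_PASSAGES = [
    "Hybrid-SQuAD pairs DBLP and SemOpenAlex facts with Wikipedia text.",
    "Scholarly QA systems often target a single, homogeneous source.",
]

def tokens(text: str) -> set[str]:
    """Lowercase alphanumeric tokens, for crude keyword matching."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question: str, k: int = 2) -> list[str]:
    """Rank verbalized triples and text passages jointly by token overlap."""
    candidates = [" ".join(t) for t in KG_TRIPLES] + TEXT_PASSAGES
    return sorted(candidates,
                  key=lambda c: len(tokens(question) & tokens(c)),
                  reverse=True)[:k]

def build_prompt(question: str) -> str:
    """Assemble retrieved KG and text evidence into one generation prompt."""
    evidence = "\n".join(f"- {e}" for e in retrieve(question))
    return (f"Answer using the evidence below.\n"
            f"Evidence:\n{evidence}\nQuestion: {question}\nAnswer:")

if __name__ == "__main__":
    # A real system would send this prompt to an LLM; here we only
    # show the assembled hybrid input (one KG fact plus one passage).
    print(build_prompt("What is the size of Hybrid-SQuAD?"))
```

A production version would swap the keyword scorer for dense retrievers over the KG and the text index, but the assembly step, verbalizing triples so they can share a prompt with passages, is the core of the hybrid idea.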
Related papers
- PeerQA: A Scientific Question Answering Dataset from Peer Reviews [51.95579001315713]
We present PeerQA, a real-world, scientific, document-level Question Answering dataset.
The dataset contains 579 QA pairs from 208 academic articles, with a majority from ML and NLP.
We provide a detailed analysis of the collected dataset and conduct experiments establishing baseline systems for all three tasks.
arXiv Detail & Related papers (2025-02-19T12:24:46Z)
- HybGRAG: Hybrid Retrieval-Augmented Generation on Textual and Relational Knowledge Bases [36.46450059250384]
We propose HybGRAG, a model for hybrid question answering (HQA) that consists of a retriever bank and a critic module.
In experiments on the STaRK benchmark, HybGRAG achieves significant performance gains, with an average relative improvement in Hit@1 of 51%.
arXiv Detail & Related papers (2024-12-20T19:49:12Z)
- RAG-based Question Answering over Heterogeneous Data and Text [23.075485587443485]
This article presents the QUASAR system for question answering over unstructured text, structured tables, and knowledge graphs.
The system adopts a RAG-based architecture: a pipeline of evidence retrieval followed by answer generation, the latter powered by a moderate-sized language model.
Experiments with three different benchmarks demonstrate the high answering quality of our approach, which is on par with or better than that of large GPT models.
arXiv Detail & Related papers (2024-12-10T11:18:29Z)
- An Empirical Comparison of LM-based Question and Answer Generation Methods [79.31199020420827]
Question and answer generation (QAG) consists of generating a set of question-answer pairs given a context.
In this paper, we establish baselines with three different QAG methodologies that leverage sequence-to-sequence language model (LM) fine-tuning.
Experiments show that an end-to-end QAG model, which is computationally light at both training and inference time, is generally robust and outperforms other, more convoluted approaches.
arXiv Detail & Related papers (2023-05-26T14:59:53Z)
- HeteroQA: Learning towards Question-and-Answering through Multiple Information Sources via Heterogeneous Graph Modeling [50.39787601462344]
Community Question Answering (CQA) is a well-defined task with many applications, such as e-commerce and online special-interest user communities.
Most CQA methods incorporate only articles or Wikipedia to extract knowledge and answer users' questions.
We propose a question-aware heterogeneous graph transformer to incorporate the multiple information sources (MIS) in the user community to automatically generate the answer.
arXiv Detail & Related papers (2021-12-27T10:16:43Z)
- Generating Self-Contained and Summary-Centric Question Answer Pairs via Differentiable Reward Imitation Learning [7.2745835227138045]
We propose a model for generating question-answer pairs (QA pairs) with self-contained, summary-centric questions and length-constrained, article-summarizing answers.
An accompanying dataset of articles with such QA pairs is used to learn a generation model that produces summary answers balancing brevity with sufficiency, jointly with their corresponding questions.
arXiv Detail & Related papers (2021-09-10T06:34:55Z)
- TAT-QA: A Question Answering Benchmark on a Hybrid of Tabular and Textual Content in Finance [71.76018597965378]
We build a new large-scale Question Answering dataset containing both Tabular And Textual data, named TAT-QA.
We propose a novel QA model termed TAGOP, which is capable of reasoning over both tables and text.
arXiv Detail & Related papers (2021-05-17T06:12:06Z)
- Generating Diverse and Consistent QA pairs from Contexts with Information-Maximizing Hierarchical Conditional VAEs [62.71505254770827]
We propose an Information-Maximizing Hierarchical Conditional Variational Autoencoder (HCVAE) for generating QA pairs given unstructured texts as contexts.
Our model obtains impressive performance gains over all baselines on both tasks, using only a fraction of the training data.
arXiv Detail & Related papers (2020-05-28T08:26:06Z)
- Template-Based Question Generation from Retrieved Sentences for Improved Unsupervised Question Answering [98.48363619128108]
We propose an unsupervised approach to training QA models with generated pseudo-training data.
We show that generating questions for QA training by applying a simple template to a related, retrieved sentence, rather than to the original context sentence, improves downstream QA performance (a toy template example appears after this list).
arXiv Detail & Related papers (2020-04-24T17:57:45Z)
- HybridQA: A Dataset of Multi-Hop Question Answering over Tabular and Textual Data [39.91331662575689]
We present HybridQA, a new large-scale question-answering dataset that requires reasoning on heterogeneous information.
Each question is aligned with a Wikipedia table and multiple free-form corpora linked with the entities in the table.
Experiments show that the EM scores obtained by the two baselines are below 20%, while the hybrid model achieves an EM above 40% (the EM metric is sketched after this list).
arXiv Detail & Related papers (2020-04-15T21:18:15Z)
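Several results above are quoted as exact match (EM) scores: 69.65 for the Hybrid-SQuAD baseline and the sub-20% versus above-40% contrast for HybridQA. As a reference point, the sketch below shows a common SQuAD-style way to compute EM (lowercase, strip punctuation and articles, collapse whitespace before comparing strings); the exact normalization used by each paper may differ.

```python
import re
import string

def normalize(text: str) -> str:
    """SQuAD-style answer normalization: lowercase, drop punctuation,
    remove articles (a/an/the), and collapse extra whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction: str, gold: str) -> bool:
    """True iff the normalized prediction equals the normalized gold answer."""
    return normalize(prediction) == normalize(gold)

# A test-set EM score is the percentage of questions answered exactly.
preds = ["the Eiffel Tower", "10.5K"]
golds = ["Eiffel Tower", "10,500"]
em = 100.0 * sum(exact_match(p, g) for p, g in zip(preds, golds)) / len(preds)
print(f"EM = {em:.2f}")  # 50.00: only the first pair matches after normalization
```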
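The template-based question generation entry above turns retrieved sentences into pseudo-training questions. The toy cloze-style template below is an assumed illustration of that general recipe, not the paper's actual templates: pick an answer span in a retrieved sentence and replace it with a wh-word to form a question-answer pair.

```python
"""Toy illustration of template-based question generation for
unsupervised QA: replace a chosen answer span in a retrieved
sentence with a wh-word to produce a pseudo-question."""

# Wh-word chosen by a (here, hypothetical) answer-type tagger.
WH_BY_TYPE = {"PERSON": "who", "DATE": "when", "OTHER": "what"}

def cloze_question(sentence: str, answer: str, answer_type: str = "OTHER") -> str:
    """Replace the answer span with a wh-word to form a pseudo-question."""
    wh = WH_BY_TYPE.get(answer_type, "what")
    return sentence.replace(answer, wh).rstrip(".") + "?"

# The (question, answer, sentence) triple becomes a pseudo-training
# example for an extractive QA model.
sentence = "Marie Curie discovered radium in 1898."
print(cloze_question(sentence, "radium"))  # "Marie Curie discovered what in 1898?"
```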
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.