Integrating SPARQL and LLMs for Question Answering over Scholarly Data Sources
- URL: http://arxiv.org/abs/2409.18969v2
- Date: Thu, 28 Nov 2024 20:29:44 GMT
- Title: Integrating SPARQL and LLMs for Question Answering over Scholarly Data Sources
- Authors: Fomubad Borista Fondi, Azanzi Jiomekong Fidel, Gaoussou Camara,
- Abstract summary: This paper describes a methodology that combines SPARQL queries, divide and conquer algorithms, and a pre-trained extractive question answering model.
It starts with SPARQL queries to gather data, then applies divide and conquer to manage various question types and sources, and uses the model to handle personal author questions.
The approach, evaluated with Exact Match and F-score metrics, shows promise for improving QA accuracy and efficiency in scholarly contexts.
- Abstract: The Scholarly Hybrid Question Answering over Linked Data (QALD) Challenge at the International Semantic Web Conference (ISWC) 2024 focuses on Question Answering (QA) over diverse scholarly sources: DBLP, SemOpenAlex, and Wikipedia-based texts. This paper describes a methodology that combines SPARQL queries, divide and conquer algorithms, and a pre-trained extractive question answering model. It starts with SPARQL queries to gather data, then applies divide and conquer to manage various question types and sources, and uses the model to handle personal author questions. The approach, evaluated with Exact Match and F-score metrics, shows promise for improving QA accuracy and efficiency in scholarly contexts.
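As a rough illustration of how such a pipeline could be assembled, here is a minimal sketch: a SPARQL query gathers author data, a simple routing rule plays the role of the divide-and-conquer step, and an extractive model answers personal author questions. The endpoint URL, SPARQL predicates, routing heuristic, and QA model below are assumptions for the sketch, not details taken from the paper.

```python
from SPARQLWrapper import SPARQLWrapper, JSON
from transformers import pipeline

# Illustrative choices only: endpoint, schema predicates, and model are
# assumptions, not the configuration reported in the paper.
DBLP_ENDPOINT = "https://sparql.dblp.org/sparql"
qa_model = pipeline("question-answering", model="deepset/roberta-base-squad2")

def run_sparql(query: str) -> list:
    """Run a SELECT query and return the JSON result bindings."""
    client = SPARQLWrapper(DBLP_ENDPOINT)
    client.setQuery(query)
    client.setReturnFormat(JSON)
    return client.query().convert()["results"]["bindings"]

def author_paper_titles(author_name: str) -> list[str]:
    """Step 1: gather data about an author with a SPARQL query."""
    query = f"""
        PREFIX dblp: <https://dblp.org/rdf/schema#>
        PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
        SELECT ?title WHERE {{
            ?paper dblp:title ?title ;
                   dblp:authoredBy ?author .
            ?author rdfs:label "{author_name}" .
        }} LIMIT 50
    """
    return [b["title"]["value"] for b in run_sparql(query)]

def answer(question: str, author_name: str) -> str:
    """Step 2: divide and conquer by question type (toy routing rule)."""
    titles = author_paper_titles(author_name)
    if question.lower().startswith("how many"):
        # Count-style questions can be answered directly from structured results.
        return str(len(titles))
    # Step 3: personal author questions go to the extractive QA model, which
    # selects an answer span from the verbalized SPARQL results.
    context = f"{author_name} wrote the papers: " + "; ".join(titles)
    return qa_model(question=question, context=context)["answer"]
```

The key design point is the split of responsibilities: structured queries handle questions the knowledge graphs can answer exactly, while the extractive model covers questions that require reading verbalized context.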
Related papers
- PeerQA: A Scientific Question Answering Dataset from Peer Reviews [51.95579001315713]
We present PeerQA, a real-world, scientific, document-level Question Answering dataset.
The dataset contains 579 QA pairs from 208 academic articles, with a majority from ML and NLP.
We provide a detailed analysis of the collected dataset and conduct experiments establishing baseline systems for all three tasks.
arXiv Detail & Related papers (2025-02-19T12:24:46Z)
- MODS: Moderating a Mixture of Document Speakers to Summarize Debatable Queries in Document Collections [57.588478932185005]
We introduce Debatable QFS, a task to create summaries that answer queries via documents with opposing perspectives.
We design MODS, a multi-LLM framework mirroring human panel discussions.
Experiments on ConflictingQA with controversial web queries and DebateQFS, our new dataset of debate queries from Debatepedia, show MODS beats SOTA by 38-59% in topic paragraph coverage and balance.
arXiv Detail & Related papers (2025-02-01T05:08:14Z)
- RAGONITE: Iterative Retrieval on Induced Databases and Verbalized RDF for Conversational QA over KGs with RAG [6.4032082023113475]
SPARQL is brittle for complex intents and conversational questions.
We propose a novel two-pronged system where we fuse: (i) SPARQL results over a database automatically derived from the knowledge graph, and (ii) text-search results over verbalizations of KG facts.
Our pipeline supports iterative retrieval: when the results of any branch are found to be unsatisfactory, the system can automatically opt for further rounds.
We demonstrate the superiority of our proposed system over several baselines on a knowledge graph of BMW automobiles.
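A minimal sketch of the two-branch idea follows, with toy data, keyword matching in place of SQL and text retrieval, and a naive stopping criterion; none of this is the RAGONITE implementation.

```python
# Branch (i): rows of a database automatically derived from the knowledge graph.
CAR_TABLE = [
    {"model": "bmw i4", "fuel": "electric", "horsepower": 340},
    {"model": "bmw x5", "fuel": "diesel", "horsepower": 286},
]
# Branch (ii): verbalizations of knowledge-graph facts, searched as text.
VERBALIZED_FACTS = [
    "The BMW i4 is an electric car with 340 horsepower.",
    "The BMW X5 diesel produces 286 horsepower.",
]

def sql_branch(question: str) -> list[str]:
    # Stand-in for SQL over the induced database: keep rows whose model name
    # appears in the question.
    q = question.lower()
    return [str(row) for row in CAR_TABLE if all(w in q for w in row["model"].split())]

def text_branch(question: str) -> list[str]:
    # Stand-in for lexical search over the verbalized facts.
    q = set(question.lower().split())
    return [f for f in VERBALIZED_FACTS if len(q & set(f.lower().split())) >= 3]

def retrieve(question: str, max_rounds: int = 3) -> list[str]:
    """Fuse both branches; trigger further rounds while evidence looks thin."""
    evidence: list[str] = []
    for _ in range(max_rounds):
        for hit in sql_branch(question) + text_branch(question):
            if hit not in evidence:
                evidence.append(hit)
        if len(evidence) >= 2:  # toy "satisfactory" check; a real system would judge with an LLM
            break
        # A real system would also reformulate the question before the next round.
    return evidence

print(retrieve("How much horsepower does the BMW i4 have?"))
```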
arXiv Detail & Related papers (2024-12-23T16:16:30Z)
- Contri(e)ve: Context + Retrieve for Scholarly Question Answering [0.0]
We present a two-step solution using an open-source Large Language Model (LLM), Llama 3.1, for the Scholarly-QALD dataset.
Firstly, we extract the context pertaining to the question from different structured and unstructured data sources.
Secondly, we implement prompt engineering to improve the information retrieval performance of the LLM.
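A minimal sketch of that two-step flow, assuming Llama 3.1 is served behind an OpenAI-compatible endpoint; the endpoint URL, prompt wording, and retrieval inputs are placeholders, not the Contri(e)ve system.

```python
from openai import OpenAI

# Hypothetical local server exposing an OpenAI-compatible API (e.g. vLLM).
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

def build_prompt(question: str, structured: list[str], unstructured: list[str]) -> str:
    # Step 1: context extracted from structured (e.g. SPARQL results) and
    # unstructured (e.g. Wikipedia text) sources.
    context = "\n".join(["Structured facts:"] + structured + ["Text passages:"] + unstructured)
    # Step 2: prompt engineering, constraining the model to answer from context only.
    return (
        "Answer the question using only the context below. "
        "Reply with the answer string and nothing else.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )

def ask(question: str, structured: list[str], unstructured: list[str]) -> str:
    resp = client.chat.completions.create(
        model="meta-llama/Llama-3.1-8B-Instruct",
        messages=[{"role": "user", "content": build_prompt(question, structured, unstructured)}],
        temperature=0.0,
    )
    return resp.choices[0].message.content.strip()
```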
arXiv Detail & Related papers (2024-09-13T17:38:47Z)
- S-EQA: Tackling Situational Queries in Embodied Question Answering [48.43453390717167]
We present and tackle the problem of Embodied Question Answering with Situational Queries (S-EQA) in a household environment.
We first introduce a novel Prompt-Generate-Evaluate scheme that wraps around an LLM's output to create a dataset of unique situational queries and corresponding consensus object information.
We report a 15.31% improvement in accuracy when Visual Question Answering (VQA) uses queries framed from the generated object consensus rather than answering the situational queries directly.
arXiv Detail & Related papers (2024-05-08T00:45:20Z)
- Leveraging LLMs in Scholarly Knowledge Graph Question Answering [7.951847862547378]
The proposed system answers natural language questions over a scholarly knowledge graph by leveraging a large language model (LLM).
Our system achieves an F1 score of 99.0% on SciQA - one of the Scholarly Knowledge Graph Question Answering challenge benchmarks.
arXiv Detail & Related papers (2023-11-16T12:13:49Z)
- NLQxform: A Language Model-based Question to SPARQL Transformer [8.698533396991554]
This paper presents a question-answering (QA) system called NLQxform.
NLQxform allows users to express their complex query intentions in natural language questions.
A transformer-based language model, i.e., BART, is employed to translate questions into standard SPARQL queries.
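A sketch of that question-to-SPARQL translation step with Hugging Face BART is shown below; the generic facebook/bart-base checkpoint is only a placeholder and would need fine-tuning on question/SPARQL pairs before it emits valid queries (it is not the released NLQxform model).

```python
from transformers import BartForConditionalGeneration, BartTokenizer

# Placeholder checkpoint; a checkpoint fine-tuned on question/SPARQL pairs
# is required for meaningful output.
tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

def question_to_sparql(question: str) -> str:
    """Translate a natural language question into a SPARQL query string."""
    inputs = tokenizer(question, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_length=256, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

print(question_to_sparql("Which papers did Gaoussou Camara publish in 2023?"))
```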
arXiv Detail & Related papers (2023-11-08T21:41:45Z)
- UniKGQA: Unified Retrieval and Reasoning for Solving Multi-hop Question Answering Over Knowledge Graph [89.98762327725112]
Multi-hop Question Answering over Knowledge Graph (KGQA) aims to find the answer entities that are multiple hops away from the topic entities mentioned in a natural language question.
We propose UniKGQA, a novel approach for multi-hop KGQA task, by unifying retrieval and reasoning in both model architecture and parameter learning.
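To make the multi-hop setting concrete, here is a toy traversal over a hand-made graph; the keyword check stands in for a learned retrieval-and-reasoning model and does not reflect UniKGQA's architecture.

```python
# Toy knowledge graph: entity -> list of (relation, neighbor) edges.
KG = {
    "SPARQL": [("created_by", "W3C"), ("influenced_by", "SQL")],
    "W3C": [("founded_by", "Tim Berners-Lee"), ("located_in", "Cambridge")],
}

def multi_hop_answer(question: str, topic_entities: list[str], hops: int = 2) -> set[str]:
    """Expand hop by hop, keeping edges whose relation words occur in the question."""
    words = set(question.lower().replace("?", "").split())
    frontier = set(topic_entities)
    for _ in range(hops):
        next_frontier = {
            neighbor
            for entity in frontier
            for relation, neighbor in KG.get(entity, [])
            if any(w in words for w in relation.split("_"))  # crude relevance check
        }
        if not next_frontier:
            break
        frontier = next_frontier
    return frontier

print(multi_hop_answer("Who founded the organization that created SPARQL?", ["SPARQL"]))
```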
arXiv Detail & Related papers (2022-12-02T04:08:09Z)
- FeTaQA: Free-form Table Question Answering [33.018256483762386]
We introduce FeTaQA, a new dataset with 10K Wikipedia-based {table, question, free-form answer, supporting table cells} pairs.
FeTaQA yields a more challenging table question answering setting because it requires generating free-form text answers after retrieval, inference, and integration of multiple discontinuous facts from a structured knowledge source.
arXiv Detail & Related papers (2021-04-01T09:59:40Z)
- Open Question Answering over Tables and Text [55.8412170633547]
In open question answering (QA), the answer to a question is produced by retrieving and then analyzing documents that might contain answers to the question.
Most open QA systems have considered only retrieving information from unstructured text.
We present a new large-scale dataset Open Table-and-Text Question Answering (OTT-QA) to evaluate performance on this task.
arXiv Detail & Related papers (2020-10-20T16:48:14Z)
- Generating Diverse and Consistent QA pairs from Contexts with Information-Maximizing Hierarchical Conditional VAEs [62.71505254770827]
We propose an Information-Maximizing Hierarchical Conditional Variational AutoEncoder (HCVAE) for generating QA pairs given unstructured texts as contexts.
Our model obtains impressive performance gains over all baselines on both tasks, using only a fraction of data for training.
arXiv Detail & Related papers (2020-05-28T08:26:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.