Integrating SPARQL and LLMs for Question Answering over Scholarly Data Sources
- URL: http://arxiv.org/abs/2409.18969v1
- Date: Wed, 11 Sep 2024 14:50:28 GMT
- Title: Integrating SPARQL and LLMs for Question Answering over Scholarly Data Sources
- Authors: Fomubad Borista Fondi, Azanzi Jiomekong Fidel
- Abstract summary: This paper describes a methodology that combines SPARQL queries, divide and conquer algorithms, and BERT-based-case-SQuad2 predictions.
The approach, evaluated with Exact Match and F-score metrics, shows promise for improving QA accuracy and efficiency in scholarly contexts.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Scholarly Hybrid Question Answering over Linked Data (QALD) Challenge at the International Semantic Web Conference (ISWC) 2024 focuses on Question Answering (QA) over diverse scholarly sources: DBLP, SemOpenAlex, and Wikipedia-based texts. This paper describes a methodology that combines SPARQL queries, divide and conquer algorithms, and BERT-based-case-SQuad2 predictions. It starts with SPARQL queries to gather data, then applies divide and conquer to manage various question types and sources, and uses BERT to handle personal author questions. The approach, evaluated with Exact Match and F-score metrics, shows promise for improving QA accuracy and efficiency in scholarly contexts.
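
The pipeline outlined in the abstract (gather data with SPARQL, then split questions by type and source) could be sketched roughly as below. This is a minimal illustration, not the authors' implementation: the `ex:` prefix, the predicates `ex:authoredBy` and `ex:title`, the cue list `PERSONAL_CUES`, and both function names are hypothetical, and the real DBLP and SemOpenAlex schemas may differ.

```python
# Hypothetical prefix and predicate names, used only for illustration;
# they are NOT taken from the DBLP or SemOpenAlex schemas.
PREFIXES = "PREFIX ex: <http://example.org/schema#>\n"

# Hypothetical keyword cues that mark a question as a "personal author
# question", i.e. one to hand to an extractive QA model over text.
PERSONAL_CUES = ("born", "nationality", "alma mater", "award")


def build_author_query(author_uri: str) -> str:
    """Return a SPARQL query string that lists papers by one author."""
    return (
        PREFIXES
        + "SELECT ?paper ?title WHERE {\n"
        + f"  ?paper ex:authoredBy <{author_uri}> ;\n"
        + "         ex:title ?title .\n"
        + "}"
    )


def route_question(question: str) -> str:
    """Divide step: choose a handler from the question's wording.

    Returns "text_qa" for personal author questions and "sparql"
    for everything else.
    """
    q = question.lower()
    return "text_qa" if any(cue in q for cue in PERSONAL_CUES) else "sparql"
```

In this sketch a question such as "Where was the author born?" would be routed to the text QA branch, while a publication-count question would go to the SPARQL branch. A real system would likely need a more robust classifier than keyword matching.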

Related papers
- Text-to-SPARQL Goes Beyond English: Multilingual Question Answering Over Knowledge Graphs through Human-Inspired Reasoning [51.203811759364925]
 mKGQAgent breaks down the task of converting natural language questions into SPARQL queries into modular, interpretable subtasks.
Evaluated on the DBpedia- and Corporate-based KGQA benchmarks within the Text2SPARQL challenge 2025, our approach took first place among the participants.
 arXiv  Detail & Related papers  (2025-07-22T19:23:03Z)
- The benefits of query-based KGQA systems for complex and temporal questions in LLM era [55.20230501807337]
 Large language models excel in question-answering (QA) yet still struggle with multi-hop reasoning and temporal questions.
Query-based knowledge graph QA (KGQA) offers a modular alternative by generating executable queries instead of direct answers.
We explore a multi-stage query-based framework for WikiData QA, proposing a multi-stage approach that enhances performance on challenging multi-hop and temporal benchmarks.
 arXiv  Detail & Related papers  (2025-07-16T06:41:03Z)
- Q${}^2$Forge: Minting Competency Questions and SPARQL Queries for Question-Answering Over Knowledge Graphs [6.6757601046766135]
 The SPARQL query language is the standard method to access knowledge graphs (KGs).
Best practices recommend documenting KGs with competency questions and example queries.
Q${}^2$Forge addresses the challenge of generating new competency questions for a KG and corresponding SPARQL queries.
 arXiv  Detail & Related papers  (2025-05-19T13:26:51Z)
- PeerQA: A Scientific Question Answering Dataset from Peer Reviews [51.95579001315713]
 We present PeerQA, a real-world, scientific, document-level Question Answering dataset.
The dataset contains 579 QA pairs from 208 academic articles, with a majority from ML and NLP.
We provide a detailed analysis of the collected dataset and conduct experiments establishing baseline systems for all three tasks.
 arXiv  Detail & Related papers  (2025-02-19T12:24:46Z)
- MODS: Moderating a Mixture of Document Speakers to Summarize Debatable Queries in Document Collections [57.588478932185005]
 We introduce Debatable QFS, a task to create summaries that answer queries via documents with opposing perspectives.
We design MODS, a multi-LLM framework mirroring human panel discussions.
Experiments on ConflictingQA with controversial web queries and DebateQFS, our new dataset of debate queries from Debatepedia, show MODS beats SOTA by 38-59% in topic paragraph coverage and balance.
 arXiv  Detail & Related papers  (2025-02-01T05:08:14Z)
- RAGONITE: Iterative Retrieval on Induced Databases and Verbalized RDF for Conversational QA over KGs with RAG [6.4032082023113475]
 SPARQL is brittle for complex intents and conversational questions.
We propose a novel two-pronged system where we fuse: (i) SPARQL results over a database automatically derived from the knowledge graph, and (ii) text-search results over verbalizations of KG facts.
Our pipeline supports iterative retrieval: when the results of any branch are found to be unsatisfactory, the system can automatically opt for further rounds.
We demonstrate the superiority of our proposed system over several baselines on a knowledge graph of BMW automobiles.
 arXiv  Detail & Related papers  (2024-12-23T16:16:30Z)
- Contri(e)ve: Context + Retrieve for Scholarly Question Answering [0.0]
 We present a two-step solution using an open-source Large Language Model (LLM), Llama3.1, for the Scholarly-QALD dataset.
 First, we extract the context pertaining to the question from different structured and unstructured data sources.
 Second, we apply prompt engineering to improve the information retrieval performance of the LLM.
 arXiv  Detail & Related papers  (2024-09-13T17:38:47Z)
- S-EQA: Tackling Situational Queries in Embodied Question Answering [48.43453390717167]
 We present and tackle the problem of Embodied Question Answering with Situational Queries (S-EQA) in a household environment.
We first introduce a novel Prompt-Generate-Evaluate scheme that wraps around an LLM's output to create a dataset of unique situational queries and corresponding consensus object information.
We report an improved accuracy of 15.31% while using queries framed from the generated object consensus for Visual Question Answering (VQA) over directly answering situational ones.
 arXiv  Detail & Related papers  (2024-05-08T00:45:20Z)
- Leveraging LLMs in Scholarly Knowledge Graph Question Answering [7.951847862547378]
 KGQA answers natural language questions by leveraging a large language model (LLM).
Our system achieves an F1 score of 99.0% on SciQA - one of the Scholarly Knowledge Graph Question Answering challenge benchmarks.
 arXiv  Detail & Related papers  (2023-11-16T12:13:49Z)
- NLQxform: A Language Model-based Question to SPARQL Transformer [8.698533396991554]
 This paper presents a question-answering (QA) system called NLQxform.
 NLQxform allows users to express their complex query intentions in natural language questions.
A transformer-based language model, i.e., BART, is employed to translate questions into standard SPARQL queries.
 arXiv  Detail & Related papers  (2023-11-08T21:41:45Z)
- DIVKNOWQA: Assessing the Reasoning Ability of LLMs via Open-Domain Question Answering over Knowledge Base and Text [73.68051228972024]
 Large Language Models (LLMs) have exhibited impressive generation capabilities, but they suffer from hallucinations when relying on their internal knowledge.
Retrieval-augmented LLMs have emerged as a potential solution to ground LLMs in external knowledge.
 arXiv  Detail & Related papers  (2023-10-31T04:37:57Z)
- Semantic Parsing for Conversational Question Answering over Knowledge Graphs [63.939700311269156]
 We develop a dataset where user questions are annotated with SPARQL parses and system answers correspond to execution results thereof.
We present two different semantic parsing approaches and highlight the challenges of the task.
Our dataset and models are released at https://github.com/Edinburgh/SPICE.
 arXiv  Detail & Related papers  (2023-01-28T14:45:11Z)
- UniKGQA: Unified Retrieval and Reasoning for Solving Multi-hop Question Answering Over Knowledge Graph [89.98762327725112]
 Multi-hop Question Answering over Knowledge Graph (KGQA) aims to find the answer entities that are multiple hops away from the topic entities mentioned in a natural language question.
We propose UniKGQA, a novel approach for the multi-hop KGQA task, by unifying retrieval and reasoning in both model architecture and parameter learning.
 arXiv  Detail & Related papers  (2022-12-02T04:08:09Z)
- PACIFIC: Towards Proactive Conversational Question Answering over Tabular and Textual Data in Finance [96.06505049126345]
 We present a new dataset, named PACIFIC. Compared with existing CQA datasets, PACIFIC exhibits three key features: (i) proactivity, (ii) numerical reasoning, and (iii) hybrid context of tables and text.
A new task is defined accordingly to study Proactive Conversational Question Answering (PCQA), which combines clarification question generation and CQA.
UniPCQA performs multi-task learning over all sub-tasks in PCQA and incorporates a simple ensemble strategy to alleviate the error propagation issue in the multi-task learning by cross-validating top-$k$ sampled Seq2Seq
 arXiv  Detail & Related papers  (2022-10-17T08:06:56Z)
- FeTaQA: Free-form Table Question Answering [33.018256483762386]
 We introduce FeTaQA, a new dataset with 10K Wikipedia-based table, question, free-form answer, supporting table cells pairs.
FeTaQA yields a more challenging table question answering setting because it requires generating free-form text answers after retrieval, inference, and integration of multiple discontinuous facts from a structured knowledge source.
 arXiv  Detail & Related papers  (2021-04-01T09:59:40Z)
- Open Question Answering over Tables and Text [55.8412170633547]
 In open question answering (QA), the answer to a question is produced by retrieving and then analyzing documents that might contain answers to the question.
Most open QA systems have considered only retrieving information from unstructured text.
We present a new large-scale dataset Open Table-and-Text Question Answering (OTT-QA) to evaluate performance on this task.
 arXiv  Detail & Related papers  (2020-10-20T16:48:14Z)
- Tradeoffs in Sentence Selection Techniques for Open-Domain Question Answering [54.541952928070344]
 We describe two groups of models for sentence selection: QA-based approaches, which run a full-fledged QA system to identify answer candidates, and retrieval-based models, which find parts of each passage specifically related to each question.
We show that very lightweight QA models can do well at this task, but retrieval-based models are faster still.
 arXiv  Detail & Related papers  (2020-09-18T23:39:15Z)
- Generating Diverse and Consistent QA pairs from Contexts with Information-Maximizing Hierarchical Conditional VAEs [62.71505254770827]
 We propose a conditional variational autoencoder (HCVAE) for generating QA pairs given unstructured texts as contexts.
Our model obtains impressive performance gains over all baselines on both tasks, using only a fraction of data for training.
 arXiv  Detail & Related papers  (2020-05-28T08:26:06Z)
- RuBQ: A Russian Dataset for Question Answering over Wikidata [3.394278383312621]
 RuBQ is the first Russian knowledge base question answering (KBQA) dataset.
The high-quality dataset consists of 1,500 Russian questions of varying complexity, their English machine translations, SPARQL queries to Wikidata, reference answers, and a Wikidata sample of triples containing entities with Russian labels.
 arXiv  Detail & Related papers  (2020-05-21T14:06:15Z)