Dual Reader-Parser on Hybrid Textual and Tabular Evidence for Open
Domain Question Answering
- URL: http://arxiv.org/abs/2108.02866v1
- Date: Thu, 5 Aug 2021 22:04:13 GMT
- Title: Dual Reader-Parser on Hybrid Textual and Tabular Evidence for Open
Domain Question Answering
- Authors: Alexander Hanbo Li, Patrick Ng, Peng Xu, Henghui Zhu, Zhiguo Wang,
Bing Xiang
- Abstract summary: A large amount of the world's knowledge is stored in structured databases.
Query languages can answer questions that require complex reasoning, as well as offering full explainability.
- Score: 78.9863753810787
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The current state-of-the-art generative models for open-domain question
answering (ODQA) have focused on generating direct answers from unstructured
textual information. However, a large amount of the world's knowledge is stored in
structured databases, and it needs to be accessed using query languages such as
SQL. Furthermore, query languages can answer questions that require complex
reasoning, as well as offering full explainability. In this paper, we propose a
hybrid framework that takes both textual and tabular evidence as input and
generates either direct answers or SQL queries, depending on which form can
better answer the question. The generated SQL queries can then be executed on
the associated databases to obtain the final answers. To the best of our
knowledge, this is the first paper that applies Text2SQL to ODQA tasks.
Empirically, we demonstrate that on several ODQA datasets, the hybrid methods
consistently outperform the baseline models that only take homogeneous input
by a large margin. Specifically, we achieve state-of-the-art performance on the
OpenSQuAD dataset using a T5-base model. In a detailed analysis, we demonstrate
that being able to generate structured SQL queries can always bring gains,
especially for questions that require complex reasoning.
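The decision flow below is a minimal sketch of this hybrid idea, assuming a T5-style generator whose output string indicates whether it is a direct answer or a SQL query to be executed; the model name, input serialization, and "ANSWER:"/"SQL:" prefixes are illustrative assumptions rather than the paper's exact implementation.

```python
import sqlite3
from transformers import T5ForConditionalGeneration, T5Tokenizer

# "t5-base" is a placeholder; the paper fine-tunes a T5-base style generator
# on hybrid textual + tabular evidence.
MODEL_NAME = "t5-base"

tokenizer = T5Tokenizer.from_pretrained(MODEL_NAME)
model = T5ForConditionalGeneration.from_pretrained(MODEL_NAME)

def answer_question(question: str, passages: list[str], tables: list[str],
                    db_path: str) -> str:
    """Generate either a direct answer or a SQL query, then resolve it.

    The "SQL:" / "ANSWER:" output prefixes are assumptions, not the exact
    serialization used in the paper.
    """
    # Feed both textual and tabular evidence to the same generator.
    evidence = " ".join(passages + tables)
    inputs = tokenizer(f"question: {question} context: {evidence}",
                       return_tensors="pt", truncation=True, max_length=1024)
    output_ids = model.generate(**inputs, max_length=128)
    output = tokenizer.decode(output_ids[0], skip_special_tokens=True)

    if output.startswith("SQL:"):
        # The model chose to parse: run the query on the associated database.
        query = output[len("SQL:"):].strip()
        conn = sqlite3.connect(db_path)
        try:
            rows = conn.execute(query).fetchall()
        finally:
            conn.close()
        return ", ".join(str(value) for row in rows for value in row)
    # Otherwise the model chose to read: return the direct answer text.
    return output.removeprefix("ANSWER:").strip()
```

In the paper, the choice between answering directly and emitting SQL is learned by the generator itself; the prefix check above only stands in for that decision, and the SQL branch mirrors the execution step described in the abstract.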
Related papers
- PRACTIQ: A Practical Conversational Text-to-SQL dataset with Ambiguous and Unanswerable Queries [32.40808001281668]
Real user questions can often be ambiguous with multiple interpretations or unanswerable due to a lack of relevant data.
In this work, we construct a practical conversational text-to-SQL dataset.
We generate conversations with four turns: the initial user question, an assistant response seeking clarification, the user's clarification, and the assistant's clarified SQL response.
arXiv Detail & Related papers (2024-10-14T20:36:35Z)
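As a rough illustration of the four-turn structure described above, one PRACTIQ-style conversation could be stored as below; the field names and example values are hypothetical, not the dataset's actual schema.

```python
from dataclasses import dataclass

@dataclass
class Conversation:
    """One four-turn conversation; field names are hypothetical."""
    user_question: str            # possibly ambiguous or unanswerable
    assistant_clarification: str  # assistant asks for clarification
    user_clarification: str       # user resolves the ambiguity
    assistant_sql: str            # final clarified SQL response

example = Conversation(
    user_question="Show me the top customers.",
    assistant_clarification="Top by total spend or by number of orders?",
    user_clarification="By total spend.",
    assistant_sql="SELECT name FROM customers ORDER BY total_spend DESC LIMIT 10;",
)
```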
- UNITE: A Unified Benchmark for Text-to-SQL Evaluation [72.72040379293718]
We introduce a UNIfied benchmark for Text-to-SQL systems.
It is composed of publicly available text-to-SQL datasets and 29K databases.
Compared to the widely used Spider benchmark, we introduce a threefold increase in SQL patterns.
arXiv Detail & Related papers (2023-05-25T17:19:52Z)
- QURG: Question Rewriting Guided Context-Dependent Text-to-SQL Semantic Parsing [46.05006486399823]
This paper presents QURG, a novel Question Rewriting Guided approach to help the models achieve adequate contextual understanding.
We first train a question rewriting model to complete the current question based on the question context, and convert the rewrites into a rewriting edit matrix.
We further design a two-stream matrix encoder to jointly model rewriting relations between question and context, and the schema linking relations between natural language and structured schema.
arXiv Detail & Related papers (2023-05-11T08:45:55Z)
- Prompting GPT-3.5 for Text-to-SQL with De-semanticization and Skeleton Retrieval [17.747079214502673]
Text-to-SQL is a task that converts a natural language question into a structured query language (SQL) query that retrieves information from a database.
In this paper, we propose an LLM-based framework for Text-to-SQL which retrieves helpful demonstration examples to prompt LLMs.
We design a de-semanticization mechanism that extracts question skeletons, allowing us to retrieve similar examples based on their structural similarity.
arXiv Detail & Related papers (2023-04-26T06:02:01Z)
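A rough sketch of that skeleton-retrieval idea follows, assuming de-semanticization simply masks literal values so that demonstrations can be ranked by structural similarity; the masking rules, similarity measure, and function names are assumptions, and the paper's mechanism is more involved.

```python
import re
from difflib import SequenceMatcher

def question_skeleton(question: str) -> str:
    """Crude de-semanticization: mask quoted strings and numbers."""
    skeleton = re.sub(r'"[^"]*"|\'[^\']*\'', "<value>", question)
    skeleton = re.sub(r"\b\d+(\.\d+)?\b", "<value>", skeleton)
    return skeleton.lower()

def retrieve_demonstrations(query: str, pool: list[tuple[str, str]], k: int = 3):
    """Return the k (question, SQL) pairs whose skeletons look most similar."""
    target = question_skeleton(query)
    ranked = sorted(
        pool,
        key=lambda ex: SequenceMatcher(None, target, question_skeleton(ex[0])).ratio(),
        reverse=True,
    )
    return ranked[:k]

# The retrieved pairs would then be formatted into the LLM prompt as demonstrations.
pool = [
    ("How many employees earn more than 50000?",
     "SELECT COUNT(*) FROM employees WHERE salary > 50000;"),
    ("List the titles of movies released in 1999.",
     "SELECT title FROM movies WHERE year = 1999;"),
]
print(retrieve_demonstrations("How many students scored above 90?", pool, k=1))
```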
- Towards Generalizable and Robust Text-to-SQL Parsing [77.18724939989647]
We propose a novel TKK framework consisting of Task decomposition, Knowledge acquisition, and Knowledge composition to learn text-to-SQL parsing in stages.
We show that our framework is effective in all scenarios and achieves state-of-the-art performance on the Spider, SParC, and CoSQL datasets.
arXiv Detail & Related papers (2022-10-23T09:21:27Z)
- Weakly Supervised Text-to-SQL Parsing through Question Decomposition [53.22128541030441]
We take advantage of the recently proposed question meaning representation called QDMR.
Given questions, their QDMR structures (annotated by non-experts or automatically predicted) and the answers, we are able to automatically synthesize SQL queries.
Our results show that the weakly supervised models perform competitively with those trained on NL-SQL benchmark data.
arXiv Detail & Related papers (2021-12-12T20:02:42Z)
- Open Domain Question Answering over Virtual Documents: A Unified Approach for Data and Text [62.489652395307914]
We use the data-to-text method as a means of encoding structured knowledge for knowledge-intensive applications, i.e. open-domain question answering (QA).
Specifically, we propose a verbalizer-retriever-reader framework for open-domain QA over data and text where verbalized tables from Wikipedia and triples from Wikidata are used as augmented knowledge sources.
We show that our Unified Data and Text QA, UDT-QA, can effectively benefit from the expanded knowledge index, leading to large gains over text-only baselines.
arXiv Detail & Related papers (2021-10-16T00:11:21Z)
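The verbalizer in UDT-QA is a trained data-to-text model; the template-based sketch below only illustrates the underlying idea of flattening a table into sentences that a standard text retriever and reader can consume. The function name and templates are assumptions.

```python
def verbalize_table(title: str, header: list[str], rows: list[list[str]]) -> str:
    """Turn a small table into plain sentences for a text retriever.

    A template baseline, not the learned verbalizer used in UDT-QA.
    """
    sentences = []
    for row in rows:
        facts = ", ".join(f"{col} is {val}" for col, val in zip(header, row))
        sentences.append(f"In the table '{title}', {facts}.")
    return " ".join(sentences)

# A verbalized Wikipedia-style table can then be indexed like any other passage.
text = verbalize_table(
    "1996 Summer Olympics medal table",
    ["Nation", "Gold", "Silver", "Bronze"],
    [["United States", "44", "32", "25"], ["Russia", "26", "21", "16"]],
)
print(text)
```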
- Data Augmentation with Hierarchical SQL-to-Question Generation for Cross-domain Text-to-SQL Parsing [40.65143087243074]
This paper presents a simple yet effective data augmentation framework.
First, given a database, we automatically produce a large number of SQL queries based on an abstract syntax tree grammar.
Second, we propose a hierarchical SQL-to-question generation model to obtain high-quality natural language questions.
arXiv Detail & Related papers (2021-03-03T07:37:38Z)
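A toy sketch of this augmentation recipe, assuming a handful of SQL templates stand in for the abstract-syntax-tree grammar and a fixed template stands in for the hierarchical SQL-to-question model; both learned components in the paper are far richer than this.

```python
import random

# Toy stand-in for the AST grammar: SQL templates over a tiny schema.
SCHEMA = {"singer": ["name", "age", "country"]}
TEMPLATES = [
    "SELECT {col} FROM {table}",
    "SELECT {col} FROM {table} WHERE {col2} > {val}",
    "SELECT COUNT(*) FROM {table} WHERE {col2} = {val}",
]

def sample_sql(rng: random.Random) -> str:
    """Sample one SQL query from the template 'grammar'."""
    table = rng.choice(list(SCHEMA))
    cols = SCHEMA[table]
    return rng.choice(TEMPLATES).format(
        table=table,
        col=rng.choice(cols),
        col2=rng.choice(cols),
        val=rng.randint(1, 60),
    )

def sql_to_question(sql: str) -> str:
    """Placeholder for the hierarchical SQL-to-question generation model."""
    return f"In natural language, what does `{sql}` ask for?"

# Each (question, SQL) pair becomes an extra training example for the parser.
rng = random.Random(0)
augmented = [(sql_to_question(sql), sql) for sql in (sample_sql(rng) for _ in range(3))]
print(augmented)
```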
- Did You Ask a Good Question? A Cross-Domain Question Intention Classification Benchmark for Text-to-SQL [32.946103197082124]
TriageSQL is the first cross-domain text-to-SQL question intention classification benchmark.
It requires models to distinguish four types of unanswerable questions from answerable questions.
The baseline RoBERTa model achieves a 60% F1 score on the test set.
arXiv Detail & Related papers (2020-10-23T19:36:57Z)