Related papers: Translating synthetic natural language to database queries: a polyglot deep learning framework

URL: http://arxiv.org/abs/2104.07010v1
Date: Wed, 14 Apr 2021 17:43:51 GMT
Title: Translating synthetic natural language to database queries: a polyglot deep learning framework
Authors: Adrián Bazaga and Nupur Gunwant and Gos Micklem
Abstract summary: Polyglotter supports the mapping of natural language searches to database queries. It does not require the creation of manually annotated data for training. Our results indicate that our framework performs well on both synthetic and real databases.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: The number of databases, as well as their size and complexity, is increasing. This creates a barrier to use, especially for non-experts, who have to come to grips with the nature of the data, the way it has been represented in the database, and the specific query languages or user interfaces by which data are accessed. These difficulties worsen in research settings, where it is common to work with many different databases. One approach to improving this situation is to allow users to pose their queries in natural language. In this work we describe a machine learning framework, Polyglotter, that in a general way supports the mapping of natural language searches to database queries. Importantly, it does not require the creation of manually annotated data for training and therefore can be applied easily to multiple domains. The framework is polyglot in the sense that it supports multiple different database engines that are accessed with a variety of query languages, including SQL and Cypher. Furthermore, Polyglotter also supports multi-class queries. Our results indicate that our framework performs well on both synthetic and real databases, and may provide opportunities for database maintainers to improve accessibility to their resources.

Related papers

SM3-Text-to-Query: Synthetic Multi-Model Medical Text-to-Query Benchmark [4.049028351548513]
Different database models have a big impact on query complexity and performance. We present SM3-Text-to-Query, the first multi-model medical Text-to-Query benchmark.
arXiv Detail & Related papers (2024-11-08T12:27:13Z)
Text2SQL is Not Enough: Unifying AI and Databases with TAG [47.45480855418987]
Table-Augmented Generation (TAG) is a paradigm for answering natural language questions over databases. We develop benchmarks to study the TAG problem and find that standard methods answer no more than 20% of queries correctly.
arXiv Detail & Related papers (2024-08-27T00:50:14Z)
RB-SQL: A Retrieval-based LLM Framework for Text-to-SQL [48.516004807486745]
Large language models (LLMs) with in-context learning have significantly improved performance on the text-to-SQL task. We propose RB-SQL, a novel retrieval-based framework for in-context prompt engineering. Experimental results demonstrate that our model achieves better performance than several competitive baselines on the public datasets BIRD and Spider.
arXiv Detail & Related papers (2024-07-11T08:19:58Z)
UQE: A Query Engine for Unstructured Databases [71.49289088592842]
We investigate the potential of Large Language Models to enable unstructured data analytics. We propose a new Universal Query Engine (UQE) that directly interrogates and draws insights from unstructured data collections.
arXiv Detail & Related papers (2024-06-23T06:58:55Z)
DBCopilot: Natural Language Querying over Massive Databases via Schema Routing [47.009638761948466]
We present DBCopilot, a framework that addresses challenges by employing a compact and flexible copilot model for routing over massive databases. This framework utilizes a single lightweight differentiable search index to construct semantic mappings for massive database schemata, and navigates natural language questions to their target databases and tables in a relation joint retrieval manner.
arXiv Detail & Related papers (2023-12-06T12:37:28Z)
SQL-PaLM: Improved Large Language Model Adaptation for Text-to-SQL (extended) [53.95151604061761]
This paper introduces the SQL-PaLM framework for enhancing Text-to-SQL filtering using large language models (LLMs). With few-shot prompting, we explore the effectiveness of consistency decoding with execution-based error analyses. With instruction fine-tuning, we delve deep into understanding the critical paradigms that influence the performance of tuned LLMs.
arXiv Detail & Related papers (2023-05-26T21:39:05Z)
Interactive Text-to-SQL Generation via Editable Step-by-Step Explanations [31.3376894001311]
We introduce a new interaction mechanism that allows users to directly edit a step-by-step explanation of a query to fix errors. Our experiments on multiple datasets, as well as a user study with 24 participants, demonstrate that our approach can outperform multiple SOTA approaches.
arXiv Detail & Related papers (2023-05-12T10:45:29Z)
AskYourDB: An end-to-end system for querying and visualizing relational databases using natural language [0.0]
We propose a semantic parsing approach to address the challenge of converting complex natural language into SQL. We modified state-of-the-art models with various pre- and post-processing steps, which make up a significant part of the work when a model is deployed in production. To make the product serviceable to businesses, we added an automatic visualization framework over the queried results.
arXiv Detail & Related papers (2022-10-16T13:31:32Z)
Dual Reader-Parser on Hybrid Textual and Tabular Evidence for Open Domain Question Answering [78.9863753810787]
A large amount of the world's knowledge is stored in structured databases. Query languages can answer questions that require complex reasoning, as well as offering full explainability.
arXiv Detail & Related papers (2021-08-05T22:04:13Z)
"What Do You Mean by That?" A Parser-Independent Interactive Approach for Enhancing Text-to-SQL [49.85635994436742]
We include a human in the loop and present a novel parser-independent interactive approach (PIIA) that interacts with users using multi-choice questions. PIIA is capable of enhancing text-to-SQL performance with limited interaction turns, as shown by both simulation and human evaluation.
arXiv Detail & Related papers (2020-11-09T02:14:33Z)
Towards a Natural Language Query Processing System [0.0]
This paper reports our study on the design and development of a natural language query interface to a backend relational database. The novelty in the study lies in defining a graph database as a middle layer to store necessary metadata needed to transform a natural language query into structured query language. The translation results for some sample queries yielded a 90% accuracy rate.
arXiv Detail & Related papers (2020-09-25T19:52:20Z)
