dIR -- Discrete Information Retrieval: Conversational Search over
Unstructured (and Structured) Data with Large Language Models
- URL: http://arxiv.org/abs/2312.13264v1
- Date: Wed, 20 Dec 2023 18:41:44 GMT
- Title: dIR -- Discrete Information Retrieval: Conversational Search over
Unstructured (and Structured) Data with Large Language Models
- Authors: Pablo M. Rodriguez Bertorello and Jean Rodmond Junior Laguerre
(Computer Science Department, Stanford University)
- Abstract summary: This paper introduces dIR, Discrete Information Retrieval, providing a unified interface to query both free text and structured knowledge.
We validate our approach via a proprietary question/answer data set, concluding dIR makes a whole new class of queries on free text possible.
- Score: 0.16060477887377675
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data is stored in both structured and unstructured form. Querying both, to
power natural language conversations, is a challenge. This paper introduces
dIR, Discrete Information Retrieval, providing a unified interface to query
both free text and structured knowledge. Specifically, a Large Language Model
(LLM) transforms text into expressive representation. After the text is
extracted into columnar form, it can then be queried via a text-to-SQL Semantic
Parser, with an LLM converting natural language into SQL. Where desired, such
conversation may be effected by a multi-step reasoning conversational agent. We
validate our approach via a proprietary question/answer data set, concluding
that dIR makes a whole new class of queries on free text possible when compared
to traditionally fine-tuned dense-embedding-model-based Information Retrieval
(IR) and SQL-based Knowledge Bases (KB). For sufficiently complex queries, dIR
can succeed where no other method stands a chance.
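The pipeline the abstract describes (extract free text into columnar form, then query it through a text-to-SQL parser) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the sample documents, the `contracts` schema, and the helper names `extract_row`, `text_to_sql`, `build_table`, and `answer` are hypothetical. A regular expression and a canned lookup stand in for the two LLM calls.

```python
import re
import sqlite3

# Hypothetical free-text records (not taken from the paper).
DOCS = [
    "Acme signed a contract worth 1200 dollars in 2021.",
    "Globex signed a contract worth 800 dollars in 2022.",
    "Initech signed a contract worth 1500 dollars in 2022.",
]

PATTERN = re.compile(r"(\w+) signed a contract worth (\d+) dollars in (\d{4})")


def extract_row(text):
    """Placeholder for the LLM extraction step: free text -> (company, amount, year)."""
    match = PATTERN.search(text)
    if match is None:
        return None
    company, amount, year = match.groups()
    return company, int(amount), int(year)


def text_to_sql(question):
    """Placeholder for the LLM text-to-SQL parser: canned SQL for one known question."""
    canned = {
        "total contract value per year":
            "SELECT year, SUM(amount) FROM contracts GROUP BY year ORDER BY year",
    }
    return canned[question.lower().rstrip("?")]


def build_table(docs):
    """Load the extracted rows into an in-memory columnar (SQL) table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE contracts (company TEXT, amount INTEGER, year INTEGER)")
    rows = [row for row in map(extract_row, docs) if row is not None]
    conn.executemany("INSERT INTO contracts VALUES (?, ?, ?)", rows)
    return conn


def answer(conn, question):
    """Translate the question to SQL and run it against the extracted table."""
    return conn.execute(text_to_sql(question)).fetchall()
```

Calling `answer(build_table(DOCS), "Total contract value per year?")` returns `[(2021, 1200), (2022, 2300)]`. An aggregate like this is hard to answer by retrieving individual passages, which is the kind of query the abstract argues becomes possible once text is in columnar form.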
Related papers
- Text2SQL is Not Enough: Unifying AI and Databases with TAG [47.45480855418987]
Table-Augmented Generation (TAG) is a paradigm for answering natural language questions over databases.
We develop benchmarks to study the TAG problem and find that standard methods answer no more than 20% of queries correctly.
arXiv Detail & Related papers (2024-08-27T00:50:14Z)
- UQE: A Query Engine for Unstructured Databases [71.49289088592842]
We investigate the potential of Large Language Models to enable unstructured data analytics.
We propose a new Universal Query Engine (UQE) that directly interrogates and draws insights from unstructured data collections.
arXiv Detail & Related papers (2024-06-23T06:58:55Z)
- Semantic Parsing for Complex Data Retrieval: Targeting Query Plans vs. SQL for No-Code Access to Relational Databases [2.933060994339853]
We investigate the potential of an alternative query language with simpler syntax and modular specification of complex queries.
The proposed alternative query language is called Query Plan Language (QPL).
We present ways to address the challenge of complex queries in an iterative, user-controlled manner.
arXiv Detail & Related papers (2023-12-22T16:16:15Z)
- QURG: Question Rewriting Guided Context-Dependent Text-to-SQL Semantic Parsing [46.05006486399823]
This paper presents QURG, a novel Question Rewriting Guided approach to help the models achieve adequate contextual understanding.
We first train a question rewriting model to complete the current question based on question context, and convert them into a rewriting edit matrix.
We further design a two-stream matrix encoder to jointly model rewriting relations between question and context, and the schema linking relations between natural language and structured schema.
arXiv Detail & Related papers (2023-05-11T08:45:55Z)
- Prompting GPT-3.5 for Text-to-SQL with De-semanticization and Skeleton Retrieval [17.747079214502673]
Text-to-SQL is a task that converts a natural language question into a structured query language (SQL) to retrieve information from a database.
In this paper, we propose an LLM-based framework for Text-to-SQL which retrieves helpful demonstration examples to prompt LLMs.
We design a de-semanticization mechanism that extracts question skeletons, allowing us to retrieve similar examples based on their structural similarity.
arXiv Detail & Related papers (2023-04-26T06:02:01Z)
- A Survey on Text-to-SQL Parsing: Concepts, Methods, and Future Directions [102.8606542189429]
The goal of text-to-SQL parsing is to convert a natural language (NL) question to its corresponding structured query language (SQL) based on the evidence provided by databases.
Deep neural networks have significantly advanced this task by neural generation models, which automatically learn a mapping function from an input NL question to an output SQL query.
arXiv Detail & Related papers (2022-08-29T14:24:13Z)
- Weakly Supervised Text-to-SQL Parsing through Question Decomposition [53.22128541030441]
We take advantage of the recently proposed question meaning representation called QDMR.
Given questions, their QDMR structures (annotated by non-experts or automatically predicted) and the answers, we are able to automatically synthesize SQL queries.
Our results show that the weakly supervised models perform competitively with those trained on NL-SQL benchmark data.
arXiv Detail & Related papers (2021-12-12T20:02:42Z)
- Dual Reader-Parser on Hybrid Textual and Tabular Evidence for Open Domain Question Answering [78.9863753810787]
A large amount of the world's knowledge is stored in structured databases.
Query languages can answer questions that require complex reasoning, as well as offering full explainability.
arXiv Detail & Related papers (2021-08-05T22:04:13Z)
- "What Do You Mean by That?" A Parser-Independent Interactive Approach for Enhancing Text-to-SQL [49.85635994436742]
We include a human in the loop and present a novel parser-independent interactive approach (PIIA) that interacts with users using multi-choice questions.
PIIA is capable of enhancing text-to-SQL performance with limited interaction turns, as shown by both simulation and human evaluation.
arXiv Detail & Related papers (2020-11-09T02:14:33Z)