A Scalable Space-efficient In-database Interpretability Framework for
Embedding-based Semantic SQL Queries
- URL: http://arxiv.org/abs/2302.12178v2
- Date: Fri, 24 Feb 2023 17:22:52 GMT
- Title: A Scalable Space-efficient In-database Interpretability Framework for
Embedding-based Semantic SQL Queries
- Authors: Prabhakar Kudva, Rajesh Bordawekar, Apoorva Nitsure
- Abstract summary: We introduce a new co-occurrence based interpretability approach to capture relationships between relational entities.
Our approach provides both query-agnostic (global) and query-specific (local) interpretabilities.
- Score: 3.0938904602244346
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: AI-Powered database (AI-DB) is a novel relational database system that uses a
self-supervised neural network, database embedding, to enable semantic SQL
queries on relational tables. In this paper, we describe an architecture and
implementation of in-database interpretability infrastructure designed to
provide simple, transparent, and relatable insights into ranked results of
semantic SQL queries supported by AI-DB. We introduce a new co-occurrence based
interpretability approach to capture relationships between relational entities
and describe a space-efficient probabilistic Sketch implementation to store and
process co-occurrence counts. Our approach provides both query-agnostic
(global) and query-specific (local) interpretabilities. Experimental evaluation
demonstrates that our in-database probabilistic approach provides the same
interpretability quality as the precise space-inefficient approach, while
providing scalable and space-efficient runtime behavior (up to 8X space
savings), without any user intervention.
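The abstract does not say which probabilistic sketch is used, so the following is only an illustrative sketch, assuming a Count-Min sketch as the data structure. It shows how co-occurrence counts between entities appearing in the same row could be kept in fixed space instead of an exact pair-to-count map. The class `CountMinSketch`, the helpers `pair_key` and `sketch_cooccurrences`, the `width`/`depth` defaults, and the toy rows are all hypothetical and are not taken from the paper.

```python
import hashlib
from itertools import combinations


class CountMinSketch:
    """Fixed-size approximate counter (assumed structure, not the paper's).

    Estimates can overcount because of hash collisions, but never undercount.
    """

    def __init__(self, width=2048, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, key, row):
        # One hash function per row, obtained by salting BLAKE2b with the row number.
        digest = hashlib.blake2b(
            key.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, key, count=1):
        for row in range(self.depth):
            self.table[row][self._index(key, row)] += count

    def estimate(self, key):
        # The minimum over rows is the least collision-inflated counter.
        return min(
            self.table[row][self._index(key, row)] for row in range(self.depth)
        )


def pair_key(a, b):
    # Order-independent key, so (a, b) and (b, a) share one counter.
    return "\x1f".join(sorted((a, b)))


def sketch_cooccurrences(rows, sketch):
    # Count every unordered pair of distinct entities that share a row.
    for row in rows:
        for a, b in combinations(sorted(set(row)), 2):
            sketch.add(pair_key(a, b))


# Hypothetical toy table: each tuple is one relational row of entity tokens.
rows = [
    ("cust:A", "item:apple", "store:X"),
    ("cust:A", "item:apple", "store:Y"),
    ("cust:B", "item:pear", "store:X"),
]
sketch = CountMinSketch()
sketch_cooccurrences(rows, sketch)
apple_with_a = sketch.estimate(pair_key("cust:A", "item:apple"))
```

Memory use here depends only on `width * depth`, not on the number of distinct entity pairs, which is the general trade-off such sketches make: bounded space in exchange for estimates that may be slightly too high.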
Related papers
- Interactive-T2S: Multi-Turn Interactions for Text-to-SQL with Large Language Models [9.914489049993495]
We introduce Interactive-T2S, a framework that generates SQL queries through direct interactions with databases.
We have developed detailed exemplars to demonstrate the step-wise reasoning processes within our framework.
Our experiments on the BIRD-Dev dataset, employing a setting without oracle knowledge, reveal that our method achieves state-of-the-art results with only two exemplars.
arXiv Detail & Related papers (2024-08-09T07:43:21Z)
- UQE: A Query Engine for Unstructured Databases [71.49289088592842]
We investigate the potential of Large Language Models to enable unstructured data analytics.
We propose a new Universal Query Engine (UQE) that directly interrogates and draws insights from unstructured data collections.
arXiv Detail & Related papers (2024-06-23T06:58:55Z)
- CHESS: Contextual Harnessing for Efficient SQL Synthesis [1.9506402593665235]
We propose a new pipeline that retrieves relevant data and context, selects an efficient schema, and synthesizes correct and efficient queries.
Our method achieves new state-of-the-art performance on the cross-domain challenging BIRD dataset.
arXiv Detail & Related papers (2024-05-27T01:54:16Z)
- Improving Text-to-SQL Semantic Parsing with Fine-grained Query
Understanding [84.04706075621013]
We present a general-purpose, modular neural semantic parsing framework based on token-level fine-grained query understanding.
Our framework consists of three modules: a named entity recognizer (NER), a neural entity linker (NEL), and a neural semantic parser (NSP).
arXiv Detail & Related papers (2022-09-28T21:00:30Z)
- SUN: Exploring Intrinsic Uncertainties in Text-to-SQL Parsers [61.48159785138462]
This paper aims to improve the performance of text-to-SQL parsing by exploring the intrinsic uncertainties in neural network based approaches (called SUN).
Extensive experiments on five benchmark datasets demonstrate that our method significantly outperforms competitors and achieves new state-of-the-art results.
arXiv Detail & Related papers (2022-09-14T06:27:51Z)
- Proton: Probing Schema Linking Information from Pre-trained Language
Models for Text-to-SQL Parsing [66.55478402233399]
We propose a framework to elicit relational structures via a probing procedure based on the Poincaré distance metric.
Compared with commonly-used rule-based methods for schema linking, we found that probing relations can robustly capture semantic correspondences.
Our framework sets new state-of-the-art performance on three benchmarks.
arXiv Detail & Related papers (2022-06-28T14:05:25Z)
- BERT Meets Relational DB: Contextual Representations of Relational
Databases [4.029818252558553]
We address the problem of learning low-dimensional representations of entities in relational databases consisting of multiple tables.
We look into ways of using these attention-based models to learn embeddings for entities in the relational database.
arXiv Detail & Related papers (2021-04-30T11:23:26Z)
- Dynamic Hybrid Relation Network for Cross-Domain Context-Dependent
Semantic Parsing [52.24507547010127]
Cross-domain context-dependent semantic parsing is a new focus of research.
We present a dynamic graph framework that effectively models contextual utterances, tokens, database schemas, and their complicated interactions as the conversation proceeds.
The proposed framework outperforms all existing models by large margins, achieving new state-of-the-art performance on two large-scale benchmarks.
arXiv Detail & Related papers (2021-01-05T18:11:29Z)
- Probabilistic Case-based Reasoning for Open-World Knowledge Graph
Completion [59.549664231655726]
A case-based reasoning (CBR) system solves a new problem by retrieving 'cases' that are similar to the given problem.
In this paper, we demonstrate that such a system is achievable for reasoning in knowledge bases (KBs).
Our approach predicts attributes for an entity by gathering reasoning paths from similar entities in the KB.
arXiv Detail & Related papers (2020-10-07T17:48:12Z)
- On Embeddings in Relational Databases [11.52782249184251]
We address the problem of learning a distributed representation of entities in a relational database using a low-dimensional embedding.
Recent methods for learning an embedding take a naive approach: they fully denormalize the database by materializing the full join of all tables and representing it as a knowledge graph.
In this paper, we demonstrate a better methodology for learning representations by exploiting the underlying semantics of columns in a table while using the relation joins and the latent inter-row relationships.
arXiv Detail & Related papers (2020-05-13T17:21:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.