Related papers: A Scalable Space-efficient In-database Interpretability Framework for Embedding-based Semantic SQL Queries

A Scalable Space-efficient In-database Interpretability Framework for Embedding-based Semantic SQL Queries

URL: http://arxiv.org/abs/2302.12178v2
Date: Fri, 24 Feb 2023 17:22:52 GMT
Title: A Scalable Space-efficient In-database Interpretability Framework for Embedding-based Semantic SQL Queries
Authors: Prabhakar Kudva, Rajesh Bordawekar, Apoorva Nitsure
Abstract summary: We introduce a new co-occurrence based interpretability approach to capture relationships between relational entities. Our approach provides both query-agnostic (global) and query-specific (local) interpretabilities.
Score: 3.0938904602244346
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: AI-Powered database (AI-DB) is a novel relational database system that uses a self-supervised neural network, database embedding, to enable semantic SQL queries on relational tables. In this paper, we describe an architecture and implementation of in-database interpretability infrastructure designed to provide simple, transparent, and relatable insights into ranked results of semantic SQL queries supported by AI-DB. We introduce a new co-occurrence based interpretability approach to capture relationships between relational entities and describe a space-efficient probabilistic Sketch implementation to store and process co-occurrence counts. Our approach provides both query-agnostic (global) and query-specific (local) interpretabilities. Experimental evaluation demonstrate that our in-database probabilistic approach provides the same interpretability quality as the precise space-inefficient approach, while providing scalable and space efficient runtime behavior (up to 8X space savings), without any user intervention.

Related papers

AnDB: Breaking Boundaries with an AI-Native Database for Universal Semantic Analysis [11.419119182421964]
AnDB is an AI-native database that supports traditional O workloads and AI-driven tasks. AnDB allows users to perform semantic queries using intuitive-like statements without requiring AI expertise. AnDB future-proofs data management infrastructure, empowering users to effectively and efficiently harness the full potential of all kinds of data without starting from scratch.
arXiv Detail & Related papers (2025-02-19T15:15:59Z)
Bridging the Gap: Enabling Natural Language Queries for NoSQL Databases through Text-to-NoSQL Translation [25.638927795540454]
We introduce the Text-to-No task, which aims to convert natural language queries into accessible queries. To promote research in this area, we released a large-scale and open-source dataset for this task, named TEND (short interfaces for Text-to-No dataset) We also designed a SLM (Small Language Model)-assisted and RAG (Retrieval-augmented Generation)-assisted multi-step framework called SMART, which is specifically designed for Text-to-No conversion.
arXiv Detail & Related papers (2025-02-16T17:01:48Z)
Text-to-SQL based on Large Language Models and Database Keyword Search [0.0]
This paper proposes a strategy to compile Natural Language (NL) questions intosql queries. The strategy incorporates a dynamic few-shot examples strategy and leverages the services provided by a database keyword search (KwS) platform. Experiments show that the strategy achieves an accuracy on the real-world relational database that surpasses state-of-the-art approaches.
arXiv Detail & Related papers (2025-01-23T12:03:29Z)
Interactive-T2S: Multi-Turn Interactions for Text-to-SQL with Large Language Models [9.914489049993495]
We introduce Interactive-T2S, a framework that generatessql queries through direct interactions with databases. We have developed detailed exemplars to demonstrate the step-wise reasoning processes within our framework. Our experiments on the BIRD-Dev dataset, employing a setting without oracle knowledge, reveal that our method achieves state-of-the-art results with only two exemplars.
arXiv Detail & Related papers (2024-08-09T07:43:21Z)
UQE: A Query Engine for Unstructured Databases [71.49289088592842]
We investigate the potential of Large Language Models to enable unstructured data analytics. We propose a new Universal Query Engine (UQE) that directly interrogates and draws insights from unstructured data collections.
arXiv Detail & Related papers (2024-06-23T06:58:55Z)
Improving Text-to-SQL Semantic Parsing with Fine-grained Query Understanding [84.04706075621013]
We present a general-purpose, modular neural semantic parsing framework based on token-level fine-grained query understanding. Our framework consists of three modules: named entity recognizer (NER), neural entity linker (NEL) and neural entity linker (NSP)
arXiv Detail & Related papers (2022-09-28T21:00:30Z)
SUN: Exploring Intrinsic Uncertainties in Text-to-SQL Parsers [61.48159785138462]
This paper aims to improve the performance of text-to-dependence by exploring the intrinsic uncertainties in the neural network based approaches (called SUN) Extensive experiments on five benchmark datasets demonstrate that our method significantly outperforms competitors and achieves new state-of-the-art results.
arXiv Detail & Related papers (2022-09-14T06:27:51Z)
Proton: Probing Schema Linking Information from Pre-trained Language Models for Text-to-SQL Parsing [66.55478402233399]
We propose a framework to elicit relational structures via a probing procedure based on Poincar'e distance metric. Compared with commonly-used rule-based methods for schema linking, we found that probing relations can robustly capture semantic correspondences. Our framework sets new state-of-the-art performance on three benchmarks.
arXiv Detail & Related papers (2022-06-28T14:05:25Z)
BERT Meets Relational DB: Contextual Representations of Relational Databases [4.029818252558553]
We address the problem of learning low dimension representation of entities on relational databases consisting of multiple tables. We look into ways of using these attention-based model to learn embeddings for entities in the relational database.
arXiv Detail & Related papers (2021-04-30T11:23:26Z)
Dynamic Hybrid Relation Network for Cross-Domain Context-Dependent Semantic Parsing [52.24507547010127]
Cross-domain context-dependent semantic parsing is a new focus of research. We present a dynamic graph framework that effectively modelling contextual utterances, tokens, database schemas, and their complicated interaction as the conversation proceeds. The proposed framework outperforms all existing models by large margins, achieving new state-of-the-art performance on two large-scale benchmarks.
arXiv Detail & Related papers (2021-01-05T18:11:29Z)
Complex Coordinate-Based Meta-Analysis with Probabilistic Programming [0.0]
Coordinate-based meta-analysis (CBMA) databases are built by automatically extracting both coordinates of reported peak activations and term associations. We show how recent lifted query processing algorithms make it possible to scale to the size of large neuroimaging data. We demonstrate results for two-term conjunctive queries, both on simulated meta-analysis databases and on the widely-used Neurosynth database.
arXiv Detail & Related papers (2020-12-02T16:16:26Z)
Probabilistic Case-based Reasoning for Open-World Knowledge Graph Completion [59.549664231655726]
A case-based reasoning (CBR) system solves a new problem by retrieving cases' that are similar to the given problem. In this paper, we demonstrate that such a system is achievable for reasoning in knowledge-bases (KBs) Our approach predicts attributes for an entity by gathering reasoning paths from similar entities in the KB.
arXiv Detail & Related papers (2020-10-07T17:48:12Z)
On Embeddings in Relational Databases [11.52782249184251]
We address the problem of learning a distributed representation of entities in a relational database using a low-dimensional embedding. Recent methods for learning embedding constitute of a naive approach to consider complete denormalization of the database by relationalizing the full join of all tables and representing as a knowledge graph. In this paper we demonstrate; a better methodology for learning representations by exploiting the underlying semantics of columns in a table while using the relation joins and the latent inter-row relationships.
arXiv Detail & Related papers (2020-05-13T17:21:27Z)

This list is automatically generated from the titles and abstracts of the papers in this site.