Related papers: Role of Databases in GenAI Applications

URL: http://arxiv.org/abs/2503.04847v2
Date: Fri, 11 Apr 2025 17:07:51 GMT
Title: Role of Databases in GenAI Applications
Authors: Santosh Bhupathi
Abstract summary: Generative AI (GenAI) is transforming industries by enabling intelligent content generation, automation, and decision-making. This paper explores the critical role of databases in GenAI, emphasizing the importance of choosing the right database architecture. It categorizes database roles into conversational context (key-value/document databases), situational context (relational databases/data lakehouses), and semantic context (vector databases).
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Generative AI (GenAI) is transforming industries by enabling intelligent content generation, automation, and decision-making. However, the effectiveness of GenAI applications depends significantly on efficient data storage, retrieval, and contextual augmentation. This paper explores the critical role of databases in GenAI workflows, emphasizing the importance of choosing the right database architecture to optimize performance, accuracy, and scalability. It categorizes database roles into conversational context (key-value/document databases), situational context (relational databases/data lakehouses), and semantic context (vector databases), each serving a distinct function in enriching AI-generated responses. Additionally, the paper highlights real-time query processing, vector search for semantic retrieval, and the impact of database selection on model efficiency and scalability. By leveraging a multi-database approach, GenAI applications can achieve more context-aware, personalized, and high-performing AI-driven solutions.

Related papers

LLM and Agent-Driven Data Analysis: A Systematic Approach for Enterprise Applications and System-level Deployment [17.572976426351318]
Generative AI and Agent technologies are transforming enterprise data management and analytics. Traditional database applications and system deployment are fundamentally impacted by AI-driven tools. Data security and compliance are top priorities for organizations adopting AI technologies.
arXiv Detail & Related papers (2025-11-21T07:16:31Z)
LLM-based Multi-Agent Blackboard System for Information Discovery in Data Science [69.1690891731311]
We propose a novel multi-agent communication paradigm inspired by the blackboard architecture for traditional AI models. In this framework, a central agent posts requests to a shared blackboard, and autonomous subordinate agents respond based on their capabilities. We evaluate our method on three benchmarks that require explicit data discovery.
arXiv Detail & Related papers (2025-09-30T22:34:23Z)
Autonomous Data Agents: A New Opportunity for Smart Data [50.02229219403014]
This report argues that DataAgents represent a paradigm shift toward autonomous data-to-knowledge systems. DataAgents transform complex and unstructured data into coherent and actionable knowledge. We first examine why the convergence of agentic AI and data-to-knowledge systems has emerged as a critical trend.
arXiv Detail & Related papers (2025-09-23T06:46:41Z)
RAISE: Reasoning Agent for Interactive SQL Exploration [47.77323087050061]
We propose a novel framework that unifies schema linking, query generation, and iterative refinement within a single, end-to-end component. Our method emulates how humans answer questions when working with unfamiliar databases.
arXiv Detail & Related papers (2025-06-02T03:07:08Z)
AnDB: Breaking Boundaries with an AI-Native Database for Universal Semantic Analysis [11.419119182421964]
AnDB is an AI-native database that supports traditional OLTP workloads and AI-driven tasks. AnDB allows users to perform semantic queries using intuitive SQL-like statements without requiring AI expertise. AnDB future-proofs data management infrastructure, empowering users to effectively and efficiently harness the full potential of all kinds of data without starting from scratch.
arXiv Detail & Related papers (2025-02-19T15:15:59Z)
Top Ten Challenges Towards Agentic Neural Graph Databases [56.92578700681306]
Graph databases (GDBs) like Neo4j and TigerGraph excel at handling interconnected data but lack advanced inference capabilities. This paper introduces Agentic Neural Graph Databases (Agentic NGDBs), which extend NGDBs with three core functionalities.
arXiv Detail & Related papers (2025-01-24T04:06:50Z)
TARGA: Targeted Synthetic Data Generation for Practical Reasoning over Structured Data [9.390415313514762]
TARGA is a framework that generates high-relevance synthetic data without manual annotation. It substantially outperforms existing non-fine-tuned methods that utilize closed-source models. It exhibits superior sample efficiency, robustness, and generalization capabilities under non-I.I.D. settings.
arXiv Detail & Related papers (2024-12-27T09:16:39Z)
Data-Juicer 2.0: Cloud-Scale Adaptive Data Processing for and with Foundation Models [64.28420991770382]
Data-Juicer 2.0 is a data processing system backed by data processing operators spanning text, image, video, and audio modalities. It supports more critical tasks including data analysis, annotation, and foundation model post-training. It has been widely adopted in diverse research fields and real-world products such as Alibaba Cloud PAI.
arXiv Detail & Related papers (2024-12-23T08:29:57Z)
A Collaborative Multi-Agent Approach to Retrieval-Augmented Generation Across Diverse Data [0.0]
Retrieval-Augmented Generation (RAG) enhances Large Language Models (LLMs). Traditional RAG systems typically use a single-agent architecture to handle query generation, data retrieval, and response synthesis. This paper proposes a multi-agent RAG system to address these limitations.
arXiv Detail & Related papers (2024-12-08T07:18:19Z)
DataLab: A Unified Platform for LLM-Powered Business Intelligence [41.21303493090702]
We introduce DataLab, a unified BI platform that integrates a one-stop LLM-based agent framework with an augmented computational notebook interface. DataLab supports a wide range of BI tasks for different data roles by combining LLM assistance with user customization within a single environment. Extensive experiments demonstrate that DataLab achieves state-of-the-art performance on various BI tasks across popular research benchmarks.
arXiv Detail & Related papers (2024-12-03T06:47:15Z)
BabelBench: An Omni Benchmark for Code-Driven Analysis of Multimodal and Multistructured Data [61.936320820180875]
Large language models (LLMs) have become increasingly pivotal across various domains. BabelBench is an innovative benchmark framework that evaluates the proficiency of LLMs in managing multimodal multistructured data with code execution. Our experimental findings on BabelBench indicate that even cutting-edge models like ChatGPT 4 exhibit substantial room for improvement.
arXiv Detail & Related papers (2024-10-01T15:11:24Z)
NeurDB: On the Design and Implementation of an AI-powered Autonomous Database [27.13518136879994]
This paper introduces NeurDB, an AI-powered autonomous database. NeurDB deepens the fusion of AI and databases with adaptability to data and workload drift. Empirical evaluations demonstrate that NeurDB substantially outperforms existing solutions in managing AI analytics tasks.
arXiv Detail & Related papers (2024-08-06T07:48:51Z)
Learning towards Selective Data Augmentation for Dialogue Generation [52.540330534137794]
We argue that not all cases are beneficial for the augmentation task, and the cases suitable for augmentation should obey the following two attributes. We propose a Selective Data Augmentation framework (SDA) for the response generation task.
arXiv Detail & Related papers (2023-03-17T01:26:39Z)
Nemo: Guiding and Contextualizing Weak Supervision for Interactive Data Programming [77.38174112525168]
We present Nemo, an end-to-end interactive Supervision system that improves overall productivity of the WS learning pipeline by an average of 20% (and up to 47% in one task) compared to the prevailing WS supervision approach.
arXiv Detail & Related papers (2022-03-02T19:57:32Z)
Dynamic Hybrid Relation Network for Cross-Domain Context-Dependent Semantic Parsing [52.24507547010127]
Cross-domain context-dependent semantic parsing is a new focus of research. We present a dynamic graph framework that effectively models contextual utterances, tokens, database schemas, and their complicated interaction as the conversation proceeds. The proposed framework outperforms all existing models by large margins, achieving new state-of-the-art performance on two large-scale benchmarks.
arXiv Detail & Related papers (2021-01-05T18:11:29Z)
KILT: a Benchmark for Knowledge Intensive Language Tasks [102.33046195554886]
We present a benchmark for knowledge-intensive language tasks (KILT). All tasks in KILT are grounded in the same snapshot of Wikipedia. We find that a shared dense vector index coupled with a seq2seq model is a strong baseline.
arXiv Detail & Related papers (2020-09-04T15:32:19Z)

This list is automatically generated from the titles and abstracts of the papers on this site.