Tackling prediction tasks in relational databases with LLMs
- URL: http://arxiv.org/abs/2411.11829v1
- Date: Mon, 18 Nov 2024 18:48:13 GMT
- Title: Tackling prediction tasks in relational databases with LLMs
- Authors: Marek Wydmuch, Łukasz Borchmann, Filip Graliński,
- Abstract summary: This work addresses the notion that large language models (LLMs) cannot yield satisfactory results on relational databases.
Using the recently introduced RelBench benchmark, we demonstrate that even a straightforward application of LLMs achieves competitive performance on these tasks.
- Score: 1.8434042562191815
- License:
- Abstract: Though large language models (LLMs) have demonstrated exceptional performance across numerous problems, their application to predictive tasks in relational databases remains largely unexplored. In this work, we address the notion that LLMs cannot yield satisfactory results on relational databases due to their interconnected tables, complex relationships, and heterogeneous data types. Using the recently introduced RelBench benchmark, we demonstrate that even a straightforward application of LLMs achieves competitive performance on these tasks. These findings establish LLMs as a promising new baseline for ML on relational databases and encourage further research in this direction.
Related papers
- Making LLMs Work for Enterprise Data Tasks [4.233865241818131]
Large language models (LLMs) know little about enterprise database tables in the private data ecosystem.
As LLMs' performance is tied to their training data, a crucial question is how useful they can be in improving enterprise database management and analysis tasks.
arXiv Detail & Related papers (2024-07-22T21:16:59Z) - Enhancing Temporal Understanding in LLMs for Semi-structured Tables [50.59009084277447]
We conduct a comprehensive analysis of temporal datasets to pinpoint the specific limitations of large language models (LLMs)
Our investigation leads to enhancements in TempTabQA, a dataset specifically designed for temporal temporal question answering.
We introduce a novel approach, C.L.E.A.R. to strengthen LLM capabilities in this domain.
arXiv Detail & Related papers (2024-07-22T20:13:10Z) - Relational Database Augmented Large Language Model [59.38841050766026]
Large language models (LLMs) excel in many natural language processing (NLP) tasks.
They can only incorporate new knowledge through training or supervised fine-tuning processes.
This precise, up-to-date, and private information is typically stored in relational databases.
arXiv Detail & Related papers (2024-07-21T06:19:10Z) - SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning [70.21358720599821]
Large language models (LLMs) hold the promise of solving diverse tasks when provided with appropriate natural language prompts.
We propose SELF-GUIDE, a multi-stage mechanism in which we synthesize task-specific input-output pairs from the student LLM.
We report an absolute improvement of approximately 15% for classification tasks and 18% for generation tasks in the benchmark's metrics.
arXiv Detail & Related papers (2024-07-16T04:41:58Z) - The Synergy between Data and Multi-Modal Large Language Models: A Survey from Co-Development Perspective [53.48484062444108]
We find that the development of models and data is not two separate paths but rather interconnected.
On the one hand, vaster and higher-quality data contribute to better performance of MLLMs; on the other hand, MLLMs can facilitate the development of data.
To promote the data-model co-development for MLLM community, we systematically review existing works related to MLLMs from the data-model co-development perspective.
arXiv Detail & Related papers (2024-07-11T15:08:11Z) - Lucy: Think and Reason to Solve Text-to-SQL [12.52968634440807]
Large Language Models (LLMs) have made significant progress in assisting users to query databases in natural language.
LLMs provide state-of-the-art results on many standard benchmarks, but their performance significantly drops when applied to large enterprise databases.
We propose a new solution that combines the power of LLMs in understanding questions with automated reasoning techniques to handle complex database constraints.
arXiv Detail & Related papers (2024-07-06T18:56:42Z) - Efficient Prompting for LLM-based Generative Internet of Things [88.84327500311464]
Large language models (LLMs) have demonstrated remarkable capacities on various tasks, and integrating the capacities of LLMs into the Internet of Things (IoT) applications has drawn much research attention recently.
Due to security concerns, many institutions avoid accessing state-of-the-art commercial LLM services, requiring the deployment and utilization of open-source LLMs in a local network setting.
We propose a LLM-based Generative IoT (GIoT) system deployed in the local network setting in this study.
arXiv Detail & Related papers (2024-06-14T19:24:00Z) - ERBench: An Entity-Relationship based Automatically Verifiable Hallucination Benchmark for Large Language Models [46.07900122810749]
Large language models (LLMs) have achieved unprecedented performances in various applications, yet evaluating them is still challenging.
We contend that utilizing existing relational databases is a promising approach for constructing benchmarks.
We propose ERBench, which uses these integrity constraints to convert any database into an LLM benchmark.
arXiv Detail & Related papers (2024-03-08T12:42:36Z) - TAP4LLM: Table Provider on Sampling, Augmenting, and Packing Semi-structured Data for Large Language Model Reasoning [55.33939289989238]
We propose TAP4LLM as a versatile pre-processor suite for leveraging large language models (LLMs) in table-based tasks effectively.
It covers several distinct components: (1) table sampling to decompose large tables into manageable sub-tables based on query semantics, (2) table augmentation to enhance tables with additional knowledge from external sources or models, and (3) table packing & serialization to convert tables into various formats suitable for LLMs' understanding.
arXiv Detail & Related papers (2023-12-14T15:37:04Z) - Querying Large Language Models with SQL [16.383179496709737]
In many use-cases, information is stored in text but not available in structured data.
With the rise of pre-trained Large Language Models (LLMs), there is now an effective solution to store and use information extracted from massive corpora of text documents.
We present Galois, a prototype based on a traditional database architecture, but with new physical operators for querying the underlying LLM.
arXiv Detail & Related papers (2023-04-02T06:58:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.