The Duck's Brain: Training and Inference of Neural Networks in Modern
Database Engines
- URL: http://arxiv.org/abs/2312.17355v1
- Date: Thu, 28 Dec 2023 20:45:06 GMT
- Title: The Duck's Brain: Training and Inference of Neural Networks in Modern
Database Engines
- Authors: Maximilian E. Schüle, Thomas Neumann, and Alfons Kemper
- Abstract summary: We show how to transform data into a relational representation for training neural networks in SQL.
The evaluation in terms of runtime and memory consumption proves the suitability of modern database systems for matrix algebra.
- Score: 9.450046371705927
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Although database systems perform well in data access and manipulation, their
relational model hinders data scientists from formulating machine learning
algorithms in SQL. Nevertheless, we argue that modern database systems perform
well for machine learning algorithms expressed in relational algebra. To
overcome the barrier of the relational model, this paper shows how to transform
data into a relational representation for training neural networks in SQL: We
first describe building blocks for data transformation, model training and
inference in SQL-92 and their counterparts using an extended array data type.
Then, we compare the implementation for model training and inference using
array data types to the one using a relational representation in SQL-92 only.
The evaluation in terms of runtime and memory consumption proves the
suitability of modern database systems for matrix algebra, although specialised
array data types perform better than matrices in relational representation.
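As a minimal sketch (not taken from the paper) of the relational representation the abstract describes: a matrix can be stored in coordinate-list form, one table row per entry as (row index, column index, value), and matrix multiplication then becomes a join on the shared inner dimension followed by aggregation, all expressible in plain SQL-92. The table and column names below are illustrative assumptions:

    -- Hypothetical coordinate-list encoding: one row per matrix entry.
    CREATE TABLE a (i INTEGER, j INTEGER, v DOUBLE PRECISION);
    CREATE TABLE b (i INTEGER, j INTEGER, v DOUBLE PRECISION);

    -- C = A * B: join A's column index to B's row index, sum the products.
    SELECT a.i AS i, b.j AS j, SUM(a.v * b.v) AS v
    FROM a, b
    WHERE a.j = b.i
    GROUP BY a.i, b.j;

On the same encoding, inference for a single dense layer with a sigmoid activation, h = sigmoid(W x), could be sketched as one join-and-aggregate query (w and x are assumed weight and input tables; EXP is a common extension beyond strict SQL-92):

    -- Hypothetical forward pass for one dense layer with sigmoid activation:
    -- w(i, j, v) holds the weight matrix, x(i, v) the input vector.
    SELECT w.i AS i, 1.0 / (1.0 + EXP(-SUM(w.v * x.v))) AS v
    FROM w, x
    WHERE w.j = x.i
    GROUP BY w.i;

The array-type counterparts mentioned in the abstract would express each of these as a single array operation rather than a join-and-aggregate plan, consistent with the finding that specialised array data types outperform the purely relational representation.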
Related papers
- Transformers Meet Relational Databases [0.0]
Transformer models have continuously expanded into all machine learning domains convertible to the underlying sequence-to-sequence representation.
We introduce a modular neural message-passing scheme that closely adheres to the formal relational model.
Our results demonstrate the superior performance of this newly proposed class of neural architectures.
arXiv Detail & Related papers (2024-12-06T17:48:43Z) - Relational Deep Learning: Graph Representation Learning on Relational
Databases [69.7008152388055]
We introduce an end-to-end representation approach to learn on data laid out across multiple tables.
Message Passing Graph Neural Networks can then automatically learn across the graph to extract representations that leverage all data input.
arXiv Detail & Related papers (2023-12-07T18:51:41Z) - GFS: Graph-based Feature Synthesis for Prediction over Relational
Databases [39.975491511390985]
We propose a novel framework called Graph-based Feature Synthesis (GFS).
GFS formulates a relational database as a heterogeneous graph database.
In an experiment over four real-world multi-table relational databases, GFS outperforms previous methods designed for relational databases.
arXiv Detail & Related papers (2023-12-04T16:54:40Z) - Retrieval-Based Transformer for Table Augmentation [14.460363647772745]
We introduce a novel approach toward automatic data wrangling.
We aim to address table augmentation tasks, including row/column population and data imputation.
Our model consistently and substantially outperforms both supervised statistical methods and the current state-of-the-art transformer-based models.
arXiv Detail & Related papers (2023-06-20T18:51:21Z) - Neural RELAGGS [7.690774882108066]
Multi-relational databases are the basis of most consolidated data collections in science and industry today.
Propositionalization algorithms transform multi-relational databases into propositional data sets.
We propose a new neural network based algorithm in the spirit of RELAGGS that employs trainable composite aggregate functions.
arXiv Detail & Related papers (2022-11-04T10:42:21Z) - Learning to Synthesize Data for Semantic Parsing [57.190817162674875]
We propose a generative model which models the composition of programs and maps a program to an utterance.
Due to the simplicity of PCFG and pre-trained BART, our generative model can be efficiently learned from existing data at hand.
We evaluate our method in both in-domain and out-of-domain settings of text-to-SQL parsing on the standard benchmarks of GeoQuery and Spider.
arXiv Detail & Related papers (2021-04-12T21:24:02Z) - Learning Contextual Representations for Semantic Parsing with
Generation-Augmented Pre-Training [86.91380874390778]
We present Generation-Augmented Pre-training (GAP), which jointly learns representations of natural language utterances and table schemas by leveraging generation models to generate pre-training data.
Based on experimental results, neural semantic parsers that leverage the GAP model obtain new state-of-the-art results on both the SPIDER and CRITERIA-TO-SQL benchmarks.
arXiv Detail & Related papers (2020-12-18T15:53:50Z) - Model-Based Deep Learning [155.063817656602]
Signal processing, communications, and control have traditionally relied on classical statistical modeling techniques.
Deep neural networks (DNNs) use generic architectures which learn to operate from data, and demonstrate excellent performance.
We are interested in hybrid techniques that combine principled mathematical models with data-driven systems to benefit from the advantages of both approaches.
arXiv Detail & Related papers (2020-12-15T16:29:49Z) - Relation-Guided Representation Learning [53.60351496449232]
We propose a new representation learning method that explicitly models and leverages sample relations.
Our framework preserves the relations between samples well.
By seeking to embed samples into a subspace, we show that our method can address the large-scale and out-of-sample problem.
arXiv Detail & Related papers (2020-07-11T10:57:45Z) - Supervised Learning on Relational Databases with Graph Neural Networks [10.279748604797911]
Training machine learning models on data stored in relational databases requires significant data extraction and feature engineering efforts.
We introduce a method that uses Graph Neural Networks to overcome these challenges.
Our proposed method outperforms state-of-the-art automatic feature engineering methods on two out of three datasets.
arXiv Detail & Related papers (2020-02-06T00:57:39Z) - Multi-layer Optimizations for End-to-End Data Analytics [71.05611866288196]
We introduce Iterative Functional Aggregate Queries (IFAQ), a framework that realizes an alternative approach.
IFAQ treats the feature extraction query and the learning task as one program given in IFAQ's domain-specific language.
We show that a Scala implementation of IFAQ can outperform mlpack, Scikit, and TensorFlow by several orders of magnitude for linear regression and regression tree models over several relational datasets.
arXiv Detail & Related papers (2020-01-10T16:14:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.