Deep Neural Networks and Tabular Data: A Survey
- URL: http://arxiv.org/abs/2110.01889v1
- Date: Tue, 5 Oct 2021 09:22:39 GMT
- Title: Deep Neural Networks and Tabular Data: A Survey
- Authors: Vadim Borisov, Tobias Leemann, Kathrin Seßler, Johannes Haug,
Martin Pawelczyk, Gjergji Kasneci
- Abstract summary: This work provides an overview of state-of-the-art deep learning methods for tabular data.
We start by categorizing them into three groups: data transformations, specialized architectures, and regularization models.
We then provide a comprehensive overview of the main approaches in each group.
- Score: 6.940394595795544
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Heterogeneous tabular data are the most commonly used form of data and are
essential for numerous critical and computationally demanding applications. On
homogeneous data sets, deep neural networks have repeatedly shown excellent
performance and have therefore been widely adopted. However, their application
to modeling tabular data (inference or generation) remains highly challenging.
This work provides an overview of state-of-the-art deep learning methods for
tabular data. We start by categorizing them into three groups: data
transformations, specialized architectures, and regularization models. We then
provide a comprehensive overview of the main approaches in each group. A
discussion of deep learning approaches for generating tabular data is
complemented by strategies for explaining deep models on tabular data. Our
primary contribution is to address the main research streams and existing
methodologies in this area, while highlighting relevant challenges and open
research questions. To the best of our knowledge, this is the first in-depth
look at deep learning approaches for tabular data. This work can serve as a
valuable starting point and guide for researchers and practitioners interested
in deep learning with tabular data.
Related papers
- A Closer Look at Deep Learning on Tabular Data [52.50778536274327]
Tabular data is prevalent across various domains in machine learning.
Deep Neural Network (DNN)-based methods have shown promising performance comparable to tree-based ones.
arXiv Detail & Related papers (2024-07-01T04:24:07Z)
- Relational Deep Learning: Graph Representation Learning on Relational Databases [69.7008152388055]
We introduce an end-to-end representation approach to learn on data laid out across multiple tables.
Message Passing Graph Neural Networks can then automatically learn across the graph to extract representations that leverage all data input.
arXiv Detail & Related papers (2023-12-07T18:51:41Z)
- DataFinder: Scientific Dataset Recommendation from Natural Language Descriptions [100.52917027038369]
We operationalize the task of recommending datasets given a short natural language description.
To facilitate this task, we build the DataFinder dataset which consists of a larger automatically-constructed training set and a smaller expert-annotated evaluation set.
This system, trained on the DataFinder dataset, finds more relevant search results than existing third-party dataset search engines.
arXiv Detail & Related papers (2023-05-26T05:22:36Z)
- Graph Neural Network contextual embedding for Deep Learning on Tabular Data [0.45880283710344055]
Deep Learning (DL) has constituted a major breakthrough for AI in fields related to human skills like natural language processing.
This paper presents a novel DL model using a Graph Neural Network (GNN), more specifically an Interaction Network (IN).
Its results outperform those of a recently published survey with a DL benchmark based on five public datasets, and are competitive with boosted-tree solutions.
arXiv Detail & Related papers (2023-03-11T17:13:24Z)
- Deep networks for system identification: a Survey [56.34005280792013]
System identification learns mathematical descriptions of dynamic systems from input-output data.
The main aim of the identified model is to predict new data from previous observations.
We discuss architectures commonly adopted in the literature, like feedforward, convolutional, and recurrent networks.
arXiv Detail & Related papers (2023-01-30T12:38:31Z)
- Are Deep Image Embedding Clustering Methods Effective for Heterogeneous Tabular Data? [0.0]
This paper performs one of the first studies on deep embedding clustering of seven data sets using six state-of-the-art baseline methods proposed for image data sets.
Traditional clustering of tabular data ranks second out of eight methods and is superior to most deep embedding clustering baselines.
arXiv Detail & Related papers (2022-12-28T22:29:10Z)
- Why do tree-based models still outperform deep learning on tabular data? [0.0]
We show that tree-based models remain state-of-the-art on medium-sized data.
We conduct an empirical investigation into the differing inductive biases of tree-based models and Neural Networks (NNs).
arXiv Detail & Related papers (2022-07-18T08:36:08Z)
- Transfer Learning with Deep Tabular Models [66.67017691983182]
We show that upstream data gives tabular neural networks a decisive advantage over GBDT models.
We propose a realistic medical diagnosis benchmark for tabular transfer learning.
We propose a pseudo-feature method for cases where the upstream and downstream feature sets differ.
arXiv Detail & Related papers (2022-06-30T14:24:32Z)
- Relation-Guided Representation Learning [53.60351496449232]
We propose a new representation learning method that explicitly models and leverages sample relations.
Our framework preserves the relations between samples well.
By seeking to embed samples into a subspace, we show that our method can address the large-scale and out-of-sample problems.
arXiv Detail & Related papers (2020-07-11T10:57:45Z)
- A Survey of Deep Learning for Scientific Discovery [13.372738220280317]
We have seen fundamental breakthroughs in core problems in machine learning, largely driven by advances in deep neural networks.
The amount of data collected in a wide array of scientific domains is dramatically increasing in both size and complexity.
This suggests many exciting opportunities for deep learning applications in scientific settings.
arXiv Detail & Related papers (2020-03-26T06:16:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.