Revisiting Deep Learning Models for Tabular Data
- URL: http://arxiv.org/abs/2106.11959v5
- Date: Thu, 26 Oct 2023 12:00:03 GMT
- Title: Revisiting Deep Learning Models for Tabular Data
- Authors: Yury Gorishniy, Ivan Rubachev, Valentin Khrulkov, Artem Babenko
- Abstract summary: It is unclear to both researchers and practitioners which models perform best.
The paper identifies two simple and powerful deep architectures. The first is a ResNet-like architecture that turns out to be a strong baseline often missing from prior work.
The second is the authors' simple adaptation of the Transformer architecture for tabular data, which outperforms other solutions on most tasks.
- Score: 40.67427600770095
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The existing literature on deep learning for tabular data proposes a wide
range of novel architectures and reports competitive results on various
datasets. However, the proposed models are usually not properly compared to
each other and existing works often use different benchmarks and experiment
protocols. As a result, it is unclear for both researchers and practitioners
what models perform best. Additionally, the field still lacks effective
baselines, that is, the easy-to-use models that provide competitive performance
across different problems.
In this work, we perform an overview of the main families of DL architectures
for tabular data and raise the bar of baselines in tabular DL by identifying
two simple and powerful deep architectures. The first one is a ResNet-like
architecture which turns out to be a strong baseline that is often missing in
prior works. The second model is our simple adaptation of the Transformer
architecture for tabular data, which outperforms other solutions on most tasks.
Both models are compared to many existing architectures on a diverse set of
tasks under the same training and tuning protocols. We also compare the best DL
models with Gradient Boosted Decision Trees and conclude that there is still no
universally superior solution.
Related papers
- GFS: Graph-based Feature Synthesis for Prediction over Relational Databases [39.975491511390985]
We propose a novel framework called Graph-based Feature Synthesis (GFS).
GFS formulates a relational database as a heterogeneous graph database.
In an experiment over four real-world multi-table relational databases, GFS outperforms previous methods designed for relational databases.
arXiv Detail & Related papers (2023-12-04T16:54:40Z)
- TabR: Tabular Deep Learning Meets Nearest Neighbors in 2023 [33.70333110327871]
We present TabR -- essentially, a feed-forward network with a custom k-Nearest-Neighbors-like component in the middle.
On a set of public benchmarks with datasets up to several million objects, TabR demonstrates the best average performance.
In addition to the much higher performance, TabR is simple and significantly more efficient.
arXiv Detail & Related papers (2023-07-26T17:58:07Z)
- Federated Learning with Heterogeneous Architectures using Graph HyperNetworks [154.60662664160333]
We propose a new FL framework that accommodates heterogeneous client architecture by adopting a graph hypernetwork for parameter sharing.
Unlike existing solutions, our framework does not restrict clients to the same architecture type, makes no use of external data, and does not require clients to disclose their model architecture.
arXiv Detail & Related papers (2022-01-20T21:36:25Z)
- Data Summarization via Bilevel Optimization [48.89977988203108]
A simple yet powerful approach is to operate on small subsets of data.
In this work, we propose a generic coreset framework that formulates the coreset selection as a cardinality-constrained bilevel optimization problem.
arXiv Detail & Related papers (2021-09-26T09:08:38Z)
- AutoRC: Improving BERT Based Relation Classification Models via Architecture Search [50.349407334562045]
BERT based relation classification (RC) models have achieved significant improvements over the traditional deep learning models.
There is no consensus on which architecture is optimal.
We design a comprehensive search space for BERT based RC models and employ a neural architecture search (NAS) method to automatically discover the design choices.
arXiv Detail & Related papers (2020-09-22T16:55:49Z)
- CorDEL: A Contrastive Deep Learning Approach for Entity Linkage [70.82533554253335]
Entity linkage (EL) is a critical problem in data cleaning and integration.
With the ever-increasing growth of new data, deep learning (DL) based approaches have been proposed to alleviate the high cost of EL associated with the traditional models.
We argue that the twin-network architecture is sub-optimal for EL, leading to inherent drawbacks in existing models.
arXiv Detail & Related papers (2020-09-15T16:33:05Z)
- NASE: Learning Knowledge Graph Embedding for Link Prediction via Neural Architecture Search [9.634626241415916]
Link prediction is the task of predicting missing connections between entities in a knowledge graph (KG).
Previous work has tried to use Automated Machine Learning (AutoML) to search for the best model for a given dataset.
We propose a novel Neural Architecture Search (NAS) framework for the link prediction task.
arXiv Detail & Related papers (2020-08-18T03:34:09Z)
- Abstractive and mixed summarization for long-single documents [2.792030485253753]
This paper uses scientific papers as the dataset on which different models are trained.
In this work, six different models are compared: two with an RNN architecture, one with a CNN architecture, two with a Transformer architecture, and one with a Transformer architecture combined with reinforcement learning.
arXiv Detail & Related papers (2020-07-03T19:30:28Z)
- Deep transfer learning for improving single-EEG arousal detection [63.52264764099532]
Two datasets do not contain exactly the same setup, leading to degraded performance in single-EEG models.
We train a baseline model and replace the first two layers to prepare the architecture for single-channel electroencephalography data.
Using a fine-tuning strategy, our model yields performance similar to the baseline model and is significantly better than a comparable single-channel model.
arXiv Detail & Related papers (2020-04-10T16:51:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.