Related papers: Revisiting Deep Learning Models for Tabular Data

Revisiting Deep Learning Models for Tabular Data

URL: http://arxiv.org/abs/2106.11959v5
Date: Thu, 26 Oct 2023 12:00:03 GMT
Title: Revisiting Deep Learning Models for Tabular Data
Authors: Yury Gorishniy, Ivan Rubachev, Valentin Khrulkov, Artem Babenko
Abstract summary: It is unclear for both researchers and practitioners what models perform best. The first one is a ResNet-like architecture which turns out to be a strong baseline that is often missing in prior works. The second model is our simple adaptation of the Transformer architecture for tabular data, which outperforms other solutions on most tasks.
Score: 40.67427600770095
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The existing literature on deep learning for tabular data proposes a wide range of novel architectures and reports competitive results on various datasets. However, the proposed models are usually not properly compared to each other and existing works often use different benchmarks and experiment protocols. As a result, it is unclear for both researchers and practitioners what models perform best. Additionally, the field still lacks effective baselines, that is, the easy-to-use models that provide competitive performance across different problems. In this work, we perform an overview of the main families of DL architectures for tabular data and raise the bar of baselines in tabular DL by identifying two simple and powerful deep architectures. The first one is a ResNet-like architecture which turns out to be a strong baseline that is often missing in prior works. The second model is our simple adaptation of the Transformer architecture for tabular data, which outperforms other solutions on most tasks. Both models are compared to many existing architectures on a diverse set of tasks under the same training and tuning protocols. We also compare the best DL models with Gradient Boosted Decision Trees and conclude that there is still no universally superior solution.

Related papers

Towards Better Understanding Table Instruction Tuning: Decoupling the Effects from Data versus Models [62.47618742274461]
We fine-tune base models from the Mistral, OLMo, and Phi families on existing public training datasets. Our replication achieves performance on par with or surpassing existing table LLMs. We decouple the contributions of training data and the base model, providing insight into their individual impacts.
arXiv Detail & Related papers (2025-01-24T18:50:26Z)
GFS: Graph-based Feature Synthesis for Prediction over Relational Databases [39.975491511390985]
We propose a novel framework called Graph-based Feature Synthesis (GFS) GFS formulates relational database as a heterogeneous graph database. In an experiment over four real-world multi-table relational databases, GFS outperforms previous methods designed for relational databases.
arXiv Detail & Related papers (2023-12-04T16:54:40Z)
TabR: Tabular Deep Learning Meets Nearest Neighbors in 2023 [33.70333110327871]
We present TabR -- essentially, a feed-forward network with a custom k-Nearest-Neighbors-like component in the middle. On a set of public benchmarks with datasets up to several million objects, TabR demonstrates the best average performance. In addition to the much higher performance, TabR is simple and significantly more efficient.
arXiv Detail & Related papers (2023-07-26T17:58:07Z)
Federated Learning with Heterogeneous Architectures using Graph HyperNetworks [154.60662664160333]
We propose a new FL framework that accommodates heterogeneous client architecture by adopting a graph hypernetwork for parameter sharing. Unlike existing solutions, our framework does not limit the clients to share the same architecture type, makes no use of external data and does not require clients to disclose their model architecture.
arXiv Detail & Related papers (2022-01-20T21:36:25Z)
Data Summarization via Bilevel Optimization [48.89977988203108]
A simple yet powerful approach is to operate on small subsets of data. In this work, we propose a generic coreset framework that formulates the coreset selection as a cardinality-constrained bilevel optimization problem.
arXiv Detail & Related papers (2021-09-26T09:08:38Z)
AutoRC: Improving BERT Based Relation Classification Models via Architecture Search [50.349407334562045]
BERT based relation classification (RC) models have achieved significant improvements over the traditional deep learning models. No consensus can be reached on what is the optimal architecture. We design a comprehensive search space for BERT based RC models and employ neural architecture search (NAS) method to automatically discover the design choices.
arXiv Detail & Related papers (2020-09-22T16:55:49Z)
CorDEL: A Contrastive Deep Learning Approach for Entity Linkage [70.82533554253335]
Entity linkage (EL) is a critical problem in data cleaning and integration. With the ever-increasing growth of new data, deep learning (DL) based approaches have been proposed to alleviate the high cost of EL associated with the traditional models. We argue that the twin-network architecture is sub-optimal to EL, leading to inherent drawbacks of existing models.
arXiv Detail & Related papers (2020-09-15T16:33:05Z)
NASE: Learning Knowledge Graph Embedding for Link Prediction via Neural Architecture Search [9.634626241415916]
Link prediction is the task of predicting missing connections between entities in the knowledge graph (KG) Previous work has tried to use Automated Machine Learning (AutoML) to search for the best model for a given dataset. We propose a novel Neural Architecture Search (NAS) framework for the link prediction task.
arXiv Detail & Related papers (2020-08-18T03:34:09Z)
Abstractive and mixed summarization for long-single documents [2.792030485253753]
This paper uses scientific papers as the dataset on which different models are trained. In this work, six different models are compared, two with an RNN architecture, one with a CNN architecture, two with a Transformer architecture and one with a Transformer architecture combined with reinforcement learning.
arXiv Detail & Related papers (2020-07-03T19:30:28Z)
Deep transfer learning for improving single-EEG arousal detection [63.52264764099532]
Two datasets do not contain exactly the same setup leading to degraded performance in single-EEG models. We train a baseline model and replace the first two layers to prepare the architecture for single-channel electroencephalography data. Using a fine-tuning strategy, our model yields similar performance to the baseline model and was significantly better than a comparable single-channel model.
arXiv Detail & Related papers (2020-04-10T16:51:06Z)

This list is automatically generated from the titles and abstracts of the papers in this site.