TabKANet: Tabular Data Modeling with Kolmogorov-Arnold Network and Transformer
- URL: http://arxiv.org/abs/2409.08806v2
- Date: Wed, 2 Oct 2024 06:22:48 GMT
- Title: TabKANet: Tabular Data Modeling with Kolmogorov-Arnold Network and Transformer
- Authors: Weihao Gao, Zheng Gong, Zhuo Deng, Fuju Rong, Chucheng Chen, Lan Ma
- Abstract summary: TabKANet is a tabular data model that targets the bottlenecks in learning from numerical content.
It shows stable and significantly superior performance compared to Neural Networks (NNs), and matches or surpasses Gradient Boosted Decision Trees (GBDTs).
Our code is publicly available on GitHub.
- Score: 12.237450884462888
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Tabular data is the most common type of data in real-life scenarios. In this study, we propose the TabKANet model for tabular data modeling, which targets the bottlenecks in learning from numerical content. We constructed a Kolmogorov-Arnold Network (KAN) based Numerical Embedding Module and unified numerical and categorical features encoding within a Transformer architecture. TabKANet has demonstrated stable and significantly superior performance compared to Neural Networks (NNs) across multiple public datasets in binary classification, multi-class classification, and regression tasks. Its performance is comparable to or surpasses that of Gradient Boosted Decision Tree models (GBDTs). Our code is publicly available on GitHub: https://github.com/AI-thpremed/TabKANet.
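Below is a minimal PyTorch sketch of the architecture the abstract describes: a KAN-style learnable numerical embedding feeding a shared Transformer encoder together with categorical embeddings. It is not the authors' implementation (see their GitHub for that); the RBF basis stands in for a KAN spline parameterization, and all class names and sizes are illustrative.

```python
import torch
import torch.nn as nn

class KANNumericalEmbedding(nn.Module):
    """Maps each scalar feature to a d-dim token via a learnable
    combination of fixed RBF basis functions (a simplified stand-in
    for a KAN spline parameterization)."""
    def __init__(self, n_features, d_model, n_basis=16):
        super().__init__()
        self.register_buffer("centers", torch.linspace(-2.0, 2.0, n_basis))
        # one learnable (n_basis -> d_model) map per numerical feature
        self.coef = nn.Parameter(torch.randn(n_features, n_basis, d_model) * 0.02)

    def forward(self, x_num):                                         # (B, F)
        phi = torch.exp(-(x_num.unsqueeze(-1) - self.centers) ** 2)   # (B, F, K)
        return torch.einsum("bfk,fkd->bfd", phi, self.coef)           # (B, F, d)

class TabKANetSketch(nn.Module):
    """Unified numerical + categorical tokens through one Transformer."""
    def __init__(self, n_num, cat_cardinalities, d_model=64, n_classes=2):
        super().__init__()
        self.num_embed = KANNumericalEmbedding(n_num, d_model)
        self.cat_embeds = nn.ModuleList(nn.Embedding(c, d_model) for c in cat_cardinalities)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, x_num, x_cat):
        cat_tokens = torch.stack([emb(x_cat[:, i]) for i, emb in enumerate(self.cat_embeds)], dim=1)
        tokens = torch.cat([self.num_embed(x_num), cat_tokens], dim=1)
        return self.head(self.encoder(tokens).mean(dim=1))

model = TabKANetSketch(n_num=3, cat_cardinalities=[4, 7])
x_cat = torch.stack([torch.randint(0, 4, (8,)), torch.randint(0, 7, (8,))], dim=1)
print(model(torch.randn(8, 3), x_cat).shape)                          # torch.Size([8, 2])
```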
Related papers
- Making Pre-trained Language Models Great on Tabular Prediction [50.70574370855663]
The transferability of deep neural networks (DNNs) has driven significant progress in image and language processing.
We present TP-BERTa, a specifically pre-trained LM for tabular data prediction.
A novel relative magnitude tokenization converts scalar numerical feature values to finely discrete, high-dimensional tokens, and an intra-feature attention approach integrates feature values with the corresponding feature names.
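A rough sketch of the magnitude-tokenization idea: quantize each scalar column into a small vocabulary of rank-based bins so an LM-style model can consume numeric values as discrete tokens. TP-BERTa's actual relative magnitude tokenization and intra-feature attention are more involved; the binning below is illustrative only.

```python
import numpy as np

def magnitude_tokenize(values, n_bins=256):
    """Rank-based quantization of one numeric column into token ids
    (illustrative bin scheme, not TP-BERTa's)."""
    ranks = np.argsort(np.argsort(values)) / max(len(values) - 1, 1)
    return np.minimum((ranks * n_bins).astype(int), n_bins - 1)

col = np.array([3.2, -1.0, 0.5, 10.0, 2.2])
print(magnitude_tokenize(col, n_bins=8))   # -> [6 0 2 7 4]
```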
arXiv Detail & Related papers (2024-03-04T08:38:56Z)
- MotherNet: A Foundational Hypernetwork for Tabular Classification [1.9643748953805937]
We propose a hypernetwork architecture that we call MotherNet, trained on millions of classification tasks.
MotherNet replaces training on specific datasets with in-context learning through a single forward pass.
The child network generated by MotherNet using in-context learning outperforms neural networks trained using gradient descent on small datasets.
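A toy sketch of the hypernetwork idea: a "mother" network reads a labeled support set and emits the weights of a small child MLP in a single forward pass, with no gradient descent on the target task. Names and shapes are illustrative assumptions, not MotherNet's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMotherNet(nn.Module):
    def __init__(self, d_in, n_classes, d_hidden=16, d_ctx=64):
        super().__init__()
        self.d_in, self.d_hidden, self.n_classes = d_in, d_hidden, n_classes
        self.ctx_encoder = nn.Sequential(nn.Linear(d_in + n_classes, d_ctx), nn.ReLU())
        n_child_params = d_in * d_hidden + d_hidden + d_hidden * n_classes + n_classes
        self.weight_head = nn.Linear(d_ctx, n_child_params)

    def forward(self, x_ctx, y_ctx, x_query):
        # encode the support set and mean-pool to one dataset embedding
        y_onehot = F.one_hot(y_ctx, self.n_classes).float()
        ctx = self.ctx_encoder(torch.cat([x_ctx, y_onehot], dim=-1)).mean(dim=0)
        w = self.weight_head(ctx)                      # flattened child weights
        # unpack and run the child MLP functionally on the queries
        i = 0
        W1 = w[i:i + self.d_in * self.d_hidden].view(self.d_in, self.d_hidden); i += W1.numel()
        b1 = w[i:i + self.d_hidden]; i += self.d_hidden
        W2 = w[i:i + self.d_hidden * self.n_classes].view(self.d_hidden, self.n_classes); i += W2.numel()
        b2 = w[i:]
        return torch.relu(x_query @ W1 + b1) @ W2 + b2  # child logits

mother = TinyMotherNet(d_in=4, n_classes=3)
logits = mother(torch.randn(32, 4), torch.randint(0, 3, (32,)), torch.randn(5, 4))
print(logits.shape)                                     # torch.Size([5, 3])
```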
arXiv Detail & Related papers (2023-12-14T01:48:58Z)
- Training-Free Generalization on Heterogeneous Tabular Data via Meta-Representation [67.30538142519067]
We propose Tabular data Pre-Training via Meta-representation (TabPTM).
A deep neural network is then trained to associate these meta-representations with dataset-specific classification confidences.
Experiments validate that TabPTM achieves promising performance in new datasets, even under few-shot scenarios.
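A loose sketch of the meta-representation idea: describe every sample in a dataset-agnostic way (here, distances to each class's k nearest training examples) so one network can be trained across heterogeneous tables. The distance, k, and padding below are illustrative assumptions, not TabPTM's exact construction.

```python
import numpy as np

def meta_representation(x, X_train, y_train, n_classes, k=3):
    """Per class: sorted distances from x to its k nearest members."""
    feats = []
    for c in range(n_classes):
        d = np.sort(np.linalg.norm(X_train[y_train == c] - x, axis=1))[:k]
        feats.append(np.pad(d, (0, k - len(d)), constant_values=np.inf))
    return np.concatenate(feats)        # shape (n_classes * k,)

X = np.random.randn(100, 7); y = np.random.randint(0, 2, 100)
print(meta_representation(X[0], X[1:], y[1:], n_classes=2).shape)  # (6,)
```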
arXiv Detail & Related papers (2023-10-31T18:03:54Z)
- Graph Neural Network contextual embedding for Deep Learning on Tabular Data [0.45880283710344055]
Deep Learning (DL) has constituted a major breakthrough for AI in fields related to human skills like natural language processing.
This paper presents a novel DL model based on a Graph Neural Network (GNN), more specifically an Interaction Network (IN).
Its results outperform those of a recently published DL benchmark survey on five public datasets, and it achieves competitive results against boosted-tree solutions.
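A minimal sketch of Interaction Network-style message passing over a fully connected graph of feature embeddings, i.e., the contextual-embedding idea greatly simplified; the block and its sizes are illustrative, not the paper's model.

```python
import torch
import torch.nn as nn

class FeatureInteractionBlock(nn.Module):
    def __init__(self, d):
        super().__init__()
        self.message = nn.Sequential(nn.Linear(2 * d, d), nn.ReLU())
        self.update = nn.Sequential(nn.Linear(2 * d, d), nn.ReLU())

    def forward(self, h):                               # h: (B, n_feats, d)
        b, n, d = h.shape
        h_recv = h.unsqueeze(2).expand(b, n, n, d)      # receiver i
        h_send = h.unsqueeze(1).expand(b, n, n, d)      # sender j
        msgs = self.message(torch.cat([h_recv, h_send], dim=-1))
        agg = msgs.sum(dim=2)                           # aggregate over senders
        return self.update(torch.cat([h, agg], dim=-1))

block = FeatureInteractionBlock(32)
print(block(torch.randn(4, 6, 32)).shape)               # torch.Size([4, 6, 32])
```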
arXiv Detail & Related papers (2023-03-11T17:13:24Z)
- Neural Attentive Circuits [93.95502541529115]
We introduce a general-purpose, yet modular, neural architecture called Neural Attentive Circuits (NACs).
NACs learn the parameterization and a sparse connectivity of neural modules without using domain knowledge.
NACs achieve an 8x speedup at inference time while losing less than 3% performance.
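A very loose sketch of the modular idea: a set of small modules whose connectivity is a learned (softly sparse) routing matrix rather than hand-designed wiring. This illustrates the concept only and is not the NAC architecture.

```python
import torch
import torch.nn as nn

class ModularLayer(nn.Module):
    def __init__(self, n_modules, d):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.Sequential(nn.Linear(d, d), nn.ReLU()) for _ in range(n_modules))
        # learned module-to-module connectivity logits
        self.route_logits = nn.Parameter(torch.zeros(n_modules, n_modules))

    def forward(self, h):                                 # h: (B, n_modules, d)
        out = torch.stack([m(h[:, i]) for i, m in enumerate(self.blocks)], dim=1)
        route = torch.softmax(self.route_logits, dim=-1)  # rows ~ receivers
        return torch.einsum("ij,bjd->bid", route, out)

layer = ModularLayer(n_modules=4, d=16)
print(layer(torch.randn(2, 4, 16)).shape)                 # torch.Size([2, 4, 16])
```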
arXiv Detail & Related papers (2022-10-14T18:00:07Z)
- TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second [48.87527918630822]
We present TabPFN, a trained Transformer that can do supervised classification for small datasets in less than a second.
TabPFN performs in-context learning (ICL): it learns to make predictions from sequences of labeled examples.
We show that our method clearly outperforms boosted trees and performs on par with complex state-of-the-art AutoML systems, with up to a 230× speedup.
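TabPFN ships as an installable package, so a hedged usage sketch is possible here (pip install tabpfn); constructor arguments have varied between releases, so treat this as illustrative usage rather than a pinned API.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from tabpfn import TabPFNClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = TabPFNClassifier()        # one pre-trained Transformer, no tuning
clf.fit(X_tr, y_tr)             # "fitting" just stores the context set
print(clf.predict_proba(X_te)[:3])
```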
arXiv Detail & Related papers (2022-07-05T07:17:43Z)
- A Robust Stacking Framework for Training Deep Graph Models with Multifaceted Node Features [61.92791503017341]
Graph Neural Networks (GNNs) with numerical node features and graph structure as inputs have demonstrated superior performance on various supervised learning tasks with graph data.
However, the best models for such node features in standard supervised learning settings with IID (non-graph) data are not easily incorporated into a GNN.
Here we propose a robust stacking framework that fuses graph-aware propagation with arbitrary models intended for IID data.
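A toy illustration of the stacking flavor: let an arbitrary IID model (here gradient-boosted trees) score the node features, then fuse its predictions with the graph structure via normalized-adjacency propagation. The random graph, split, and mixing weight are assumptions; the paper's framework is considerably richer.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 8))                        # node features
y = (X[:, 0] + X[:, 1] > 0).astype(int)            # node labels
A = (rng.random((n, n)) < 0.02).astype(float)      # assumed random graph
A = np.maximum(A, A.T); np.fill_diagonal(A, 1.0)   # symmetric, self-loops
D_inv = np.diag(1.0 / A.sum(axis=1))

base = GradientBoostingClassifier().fit(X[:100], y[:100])   # IID model
p = base.predict_proba(X)[:, 1]
for _ in range(3):                                 # graph-aware smoothing
    p = 0.5 * p + 0.5 * (D_inv @ A @ p)
print("test acc:", ((p[100:] > 0.5) == y[100:]).mean())
```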
arXiv Detail & Related papers (2022-06-16T22:46:33Z)
- DANets: Deep Abstract Networks for Tabular Data Classification and Regression [9.295859461145783]
The Abstract Layer (AbstLay) learns to explicitly group correlative input features and generate higher-level features for semantic abstraction.
Stacking AbstLays yields a family of Deep Abstract Networks (DANets) for tabular data classification and regression.
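A sketch of the AbstLay idea under simplifying assumptions: learnable soft masks group correlated input features, and each group is projected into a higher-level feature. Softmax masks stand in for the paper's sparse feature-selection mechanism.

```python
import torch
import torch.nn as nn

class AbstLaySketch(nn.Module):
    def __init__(self, d_in, n_groups, d_out):
        super().__init__()
        self.mask_logits = nn.Parameter(torch.zeros(n_groups, d_in))
        self.proj = nn.Linear(d_in, d_out)

    def forward(self, x):                                 # x: (B, d_in)
        masks = torch.softmax(self.mask_logits, dim=-1)   # one soft group per row
        grouped = x.unsqueeze(1) * masks                  # (B, n_groups, d_in)
        return torch.relu(self.proj(grouped))             # (B, n_groups, d_out)

layer = AbstLaySketch(d_in=10, n_groups=4, d_out=8)
print(layer(torch.randn(5, 10)).shape)                    # torch.Size([5, 4, 8])
```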
arXiv Detail & Related papers (2021-12-06T12:15:28Z)
- TabGNN: Multiplex Graph Neural Network for Tabular Data Prediction [43.35301059378836]
We propose TabGNN, a novel framework based on recently popular graph neural networks (GNNs).
Specifically, we first construct a multiplex graph to model the multifaceted sample relations, and then design a multiplex graph neural network to learn an enhanced representation for each sample.
Experiments on eleven tabular data prediction (TDP) datasets from various domains, covering both classification and regression tasks, show that TabGNN consistently improves performance.
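A toy multiplex-graph sketch: one adjacency per relation type (e.g., samples sharing a categorical value), one propagation pass per relation, then aggregation across relations. All shapes and the mean-fusion are illustrative choices.

```python
import torch
import torch.nn as nn

class MultiplexGNNSketch(nn.Module):
    def __init__(self, n_relations, d):
        super().__init__()
        self.rel_layers = nn.ModuleList(nn.Linear(d, d) for _ in range(n_relations))
        self.out = nn.Linear(d, d)

    def forward(self, h, adjs):                    # h: (N, d); adjs: list of (N, N)
        per_rel = [torch.relu(layer(adj @ h))      # propagate within each relation
                   for layer, adj in zip(self.rel_layers, adjs)]
        return self.out(torch.stack(per_rel).mean(dim=0))  # fuse across relations

N, d = 50, 16
adjs = [torch.eye(N), (torch.rand(N, N) < 0.05).float()]
print(MultiplexGNNSketch(2, d)(torch.randn(N, d), adjs).shape)  # torch.Size([50, 16])
```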
arXiv Detail & Related papers (2021-08-20T11:51:32Z)
- Tabular Transformers for Modeling Multivariate Time Series [30.717890753132824]
Tabular datasets are ubiquitous in data science applications. Given their importance, it seems natural to apply state-of-the-art deep learning algorithms in order to fully unlock their potential.
Here we propose neural network models that represent tabular time series and can leverage their hierarchical structure.
We demonstrate our models on two datasets: a synthetic credit card transaction dataset, where the learned representations are used for fraud detection and synthetic data generation, and on a real pollution dataset, where the learned encodings are used to predict atmospheric pollutant concentrations.
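A sketch of the hierarchical encoding: a field-level Transformer turns each row (e.g., a transaction) into one embedding, and a row-level Transformer models the sequence of rows. Layer sizes and the mean-pooling are illustrative assumptions.

```python
import torch
import torch.nn as nn

class HierarchicalTabularEncoder(nn.Module):
    def __init__(self, n_fields, vocab, d=32):
        super().__init__()
        self.field_embed = nn.Embedding(vocab, d)
        enc = lambda: nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d, nhead=4, batch_first=True), 1)
        self.field_encoder, self.row_encoder = enc(), enc()

    def forward(self, x):                       # x: (B, n_rows, n_fields) token ids
        B, R, Fd = x.shape
        tok = self.field_embed(x.view(B * R, Fd))          # (B*R, n_fields, d)
        rows = self.field_encoder(tok).mean(dim=1)         # one vector per row
        return self.row_encoder(rows.view(B, R, -1))       # (B, n_rows, d)

enc = HierarchicalTabularEncoder(n_fields=6, vocab=100)
print(enc(torch.randint(0, 100, (2, 10, 6))).shape)        # torch.Size([2, 10, 32])
```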
arXiv Detail & Related papers (2020-11-03T16:58:08Z)
- Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
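A simplified sketch of layer-wise fusion: align the neurons of one model to another (here with a hard Hungarian assignment, a special case of the optimal-transport coupling) and average the aligned weights. In a full network the same permutation must also be applied to the next layer's input dimensions; the paper uses soft OT couplings rather than this hard matching.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def fuse_layer(W_a, W_b):
    """W_a, W_b: (n_out, n_in) weight matrices of the same layer."""
    cost = np.linalg.norm(W_a[:, None, :] - W_b[None, :, :], axis=-1)
    _, perm = linear_sum_assignment(cost)   # match each A-neuron to a B-neuron
    return 0.5 * (W_a + W_b[perm])          # average after alignment

W_a, W_b = np.random.randn(4, 8), np.random.randn(4, 8)
print(fuse_layer(W_a, W_b).shape)           # (4, 8)
```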
arXiv Detail & Related papers (2019-10-12T22:07:15Z)