Polynomial-based Self-Attention for Table Representation learning
- URL: http://arxiv.org/abs/2312.07753v2
- Date: Mon, 18 Dec 2023 09:13:55 GMT
- Title: Polynomial-based Self-Attention for Table Representation learning
- Authors: Jayoung Kim, Yehjin Shin, Jeongwhan Choi, Hyowon Wi, Noseong Park
- Abstract summary: Self-attention, a key component of Transformers, can lead to an oversmoothing issue.
We propose a novel matrix polynomial-based self-attention layer as a substitute for the original self-attention layer.
In our experiments with three representative table learning models equipped with our proposed layer, we illustrate that the layer effectively mitigates the oversmoothing problem.
- Score: 23.651207486167518
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Structured data, which constitutes a significant portion of existing data
types, has been a long-standing research topic in the field of machine
learning. Various representation learning methods for tabular data have been
proposed, ranging from encoder-decoder structures to Transformers. Among these,
Transformer-based methods have achieved state-of-the-art performance not only
in tabular data but also in various other fields, including computer vision and
natural language processing. However, recent studies have revealed that
self-attention, a key component of Transformers, can lead to an oversmoothing
issue. We show that Transformers for tabular data also face this problem and,
to address it, propose a novel matrix polynomial-based
self-attention layer as a substitute for the original self-attention layer,
which enhances model scalability. In our experiments with three representative
table learning models equipped with our proposed layer, we illustrate that the
layer effectively mitigates the oversmoothing problem and enhances the
representation performance of the existing methods, outperforming the
state-of-the-art table representation methods.
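The abstract does not spell out the layer's exact formulation, but a minimal sketch of one plausible reading, in which the row-stochastic attention matrix A is replaced by a learnable matrix polynomial of A applied to the values, is given below. The class name PolySelfAttention, the polynomial order, and the coefficient parameterization are illustrative assumptions, not the authors' method.

```python
# Hedged sketch: output = (w_0*I + w_1*A + ... + w_K*A^K) V, where A is the
# softmax attention matrix. Names, order K, and coefficient parameterization
# are assumptions for illustration, not the paper's exact layer.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PolySelfAttention(nn.Module):
    def __init__(self, dim: int, order: int = 3):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        self.coeffs = nn.Parameter(torch.ones(order + 1) / (order + 1))  # w_0 .. w_K
        self.order = order
        self.scale = dim ** -0.5

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, dim); for tabular models, tokens are column embeddings
        q, k, v = self.q(x), self.k(x), self.v(x)
        attn = F.softmax(q @ k.transpose(-2, -1) * self.scale, dim=-1)  # A
        out = self.coeffs[0] * v      # w_0 * I * V
        term = v
        for i in range(1, self.order + 1):
            term = attn @ term        # A^i V without forming A^i explicitly
            out = out + self.coeffs[i] * term
        return out
```

With coefficients fixed to (0, 1) and order 1 this reduces to the standard attention output A V, so the polynomial terms can be read as a learnable graph-filter-style generalization; the usual intuition is that the identity and higher-order terms keep token representations from collapsing toward a single average, which is how such filters are commonly argued to mitigate oversmoothing.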
Related papers
- TabulaX: Leveraging Large Language Models for Multi-Class Table Transformations [8.072353085704627]
In this paper, we introduce TabulaX, a novel framework that leverages Large Language Models (LLMs) for multi-class transformations.
We show that TabulaX outperforms existing state-of-the-art approaches in terms of accuracy, supports a broader class of transformations, and generates interpretable transformations that can be efficiently applied.
arXiv Detail & Related papers (2024-11-26T05:00:23Z)
- A Survey on Deep Tabular Learning [0.0]
Tabular data presents unique challenges for deep learning due to its heterogeneous nature and lack of spatial structure.
This survey reviews the evolution of deep learning models for tabular data, from early fully connected networks (FCNs) to advanced architectures like TabNet, SAINT, TabTranSELU, and MambaNet.
arXiv Detail & Related papers (2024-10-15T20:08:08Z)
- Scalable Representation Learning for Multimodal Tabular Transactions [14.18267117657451]
We present an innovative and scalable solution to these challenges.
We propose a parameter efficient decoder that interleaves transaction and text modalities.
We validate the efficacy of our solution on a large-scale dataset of synthetic payment transactions.
arXiv Detail & Related papers (2024-10-10T12:18:42Z)
- Deep Learning with Tabular Data: A Self-supervised Approach [0.0]
We have used a self-supervised learning approach in this study.
The aim is to find the most effective TabTransformer model representation of categorical and numerical features.
The research presents a novel approach by creating various variants of the TabTransformer model.
arXiv Detail & Related papers (2024-01-26T23:12:41Z)
- Masked Modeling for Self-supervised Representation Learning on Vision and Beyond [69.64364187449773]
Masked modeling has emerged as a distinctive approach that involves predicting parts of the original data that are proportionally masked during training.
We elaborate on the details of techniques within masked modeling, including diverse masking strategies, recovering targets, network architectures, and more.
We conclude by discussing the limitations of current techniques and point out several potential avenues for advancing masked modeling research.
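As a concrete anchor for the recipe this survey describes, a minimal sketch of masked reconstruction is shown below; the mask ratio, the zero-fill masking strategy, and the squared-error recovering target are illustrative assumptions, and real systems vary on all three.

```python
# Hedged sketch of masked modeling: mask a random fraction of input tokens and
# train the encoder to reconstruct only the masked positions. Mask ratio, masking
# strategy, and loss are illustrative assumptions, not a specific paper's method.
import torch
import torch.nn as nn


def masked_reconstruction_loss(x: torch.Tensor, encoder: nn.Module,
                               mask_ratio: float = 0.5) -> torch.Tensor:
    # x: (batch, tokens, dim) input embeddings
    mask = torch.rand(x.shape[:2], device=x.device) < mask_ratio  # True = masked
    x_masked = x.masked_fill(mask.unsqueeze(-1), 0.0)             # hide masked tokens
    recon = encoder(x_masked)                                     # predict all tokens
    return ((recon - x) ** 2)[mask].mean()                        # score masked positions only
```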
arXiv Detail & Related papers (2023-12-31T12:03:21Z)
- Images in Discrete Choice Modeling: Addressing Data Isomorphism in Multi-Modality Inputs [77.54052164713394]
This paper explores the intersection of Discrete Choice Modeling (DCM) and machine learning.
We investigate the consequences of embedding high-dimensional image data that shares isomorphic information with traditional tabular inputs within a DCM framework.
arXiv Detail & Related papers (2023-12-22T14:33:54Z)
- TabMT: Generating tabular data with masked transformers [0.0]
Masked Transformers are highly effective as generative models and classifiers.
This work contributes to the exploration of transformer-based models in synthetic data generation for diverse application domains.
arXiv Detail & Related papers (2023-12-11T03:28:11Z)
- A Comprehensive Survey on Applications of Transformers for Deep Learning Tasks [60.38369406877899]
The Transformer is a deep neural network that employs a self-attention mechanism to comprehend the contextual relationships within sequential data.
Transformer models excel at handling long-range dependencies between input sequence elements and enable parallel processing.
Our survey encompasses the identification of the top five application domains for transformer-based models.
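For reference alongside the polynomial variant sketched earlier, the standard scaled dot-product self-attention that such surveys describe can be written as a small function; this is the textbook formulation, not any one paper's implementation.

```python
# Standard scaled dot-product self-attention: softmax(QK^T / sqrt(d)) V.
import math
import torch
import torch.nn.functional as F


def self_attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    # q, k, v: (batch, tokens, dim)
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))  # pairwise similarities
    weights = F.softmax(scores, dim=-1)                       # row-stochastic attention
    return weights @ v                                        # weighted sum of values
```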
arXiv Detail & Related papers (2023-06-11T23:13:51Z)
- Dynamic Latent Separation for Deep Learning [67.62190501599176]
A core problem in machine learning is to learn expressive latent variables for model prediction on complex data.
Here, we develop an approach that improves expressiveness, provides partial interpretation, and is not restricted to specific applications.
arXiv Detail & Related papers (2022-10-07T17:56:53Z)
- Analogous to Evolutionary Algorithm: Designing a Unified Sequence Model [58.17021225930069]
We explain the rationale of the Vision Transformer by analogy with the proven, practical Evolutionary Algorithm (EA).
We propose a more efficient EAT model, and design task-related heads to deal with different tasks more flexibly.
Our approach achieves state-of-the-art results on the ImageNet classification task compared with recent vision transformer works.
arXiv Detail & Related papers (2021-05-31T16:20:03Z)
- Efficient Transformers: A Survey [98.23264445730645]
Transformer model architectures have garnered immense interest lately due to their effectiveness across a range of domains like language, vision and reinforcement learning.
This paper characterizes a large and thoughtful selection of recent efficiency-flavored "X-former" models.
arXiv Detail & Related papers (2020-09-14T20:38:14Z)