A Survey on Self-Supervised Learning for Non-Sequential Tabular Data
- URL: http://arxiv.org/abs/2402.01204v4
- Date: Tue, 10 Sep 2024 07:02:47 GMT
- Title: A Survey on Self-Supervised Learning for Non-Sequential Tabular Data
- Authors: Wei-Yao Wang, Wei-Wei Du, Derek Xu, Wei Wang, Wen-Chih Peng,
- Abstract summary: Self-supervised learning (SSL) has been incorporated into many state-of-the-art models in various domains.
This survey aims to systematically review and summarize the recent progress and challenges of SSL for non-sequential data (SSL4NS-TD)
We first present a formal definition of NS-TD and clarify its correlation to related studies. Then, these approaches are categorized into three groups - predictive learning, contrastive learning, and hybrid learning, with their motivations and strengths of representative methods in each direction.
- Score: 15.796140543132196
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-supervised learning (SSL) has been incorporated into many state-of-the-art models in various domains, where SSL defines pretext tasks based on unlabeled datasets to learn contextualized and robust representations. Recently, SSL has become a new trend in exploring the representation learning capability in the realm of tabular data, which is more challenging due to not having explicit relations for learning descriptive representations. This survey aims to systematically review and summarize the recent progress and challenges of SSL for non-sequential tabular data (SSL4NS-TD). We first present a formal definition of NS-TD and clarify its correlation to related studies. Then, these approaches are categorized into three groups - predictive learning, contrastive learning, and hybrid learning, with their motivations and strengths of representative methods in each direction. Moreover, application issues of SSL4NS-TD are presented, including automatic data engineering, cross-table transferability, and domain knowledge integration. In addition, we elaborate on existing benchmarks and datasets for NS-TD applications to analyze the performance of existing tabular models. Finally, we discuss the challenges of SSL4NS-TD and provide potential directions for future research. We expect our work to be useful in terms of encouraging more research on lowering the barrier to entry SSL for the tabular domain, and of improving the foundations for implicit tabular data.
Related papers
- Unified Approaches in Self-Supervised Event Stream Modeling: Progress and Prospects [2.491611479869045]
Self-Supervised Learning (SSL) has emerged as a promising paradigm to address these challenges.
We systematically review and synthesize SSL methodologies tailored for ES modeling across multiple domains.
We propose a future research agenda aimed at developing scalable, domain-agnostic SSL frameworks for ES modeling.
arXiv Detail & Related papers (2025-02-07T13:05:55Z) - A Survey of the Self Supervised Learning Mechanisms for Vision Transformers [5.152455218955949]
The application of self supervised learning (SSL) in vision tasks has gained significant attention.
We develop a comprehensive taxonomy of systematically classifying the SSL techniques.
We discuss the motivations behind SSL, review popular pre-training tasks, and highlight the challenges and advancements in this field.
arXiv Detail & Related papers (2024-08-30T07:38:28Z) - Self-Supervision for Tackling Unsupervised Anomaly Detection: Pitfalls
and Opportunities [50.231837687221685]
Self-supervised learning (SSL) has transformed machine learning and its many real world applications.
Unsupervised anomaly detection (AD) has also capitalized on SSL, by self-generating pseudo-anomalies.
arXiv Detail & Related papers (2023-08-28T07:55:01Z) - Self-Supervised Learning for Time Series Analysis: Taxonomy, Progress, and Prospects [84.6945070729684]
Self-supervised learning (SSL) has recently achieved impressive performance on various time series tasks.
This article reviews current state-of-the-art SSL methods for time series data.
arXiv Detail & Related papers (2023-06-16T18:23:10Z) - Self-Supervised Learning for Point Clouds Data: A Survey [8.858165912687923]
Self-Supervised Learning (SSL) is considered as an essential solution to solve the time-consuming and labor-intensive data labelling problems.
This paper provides a comprehensive survey of recent advances on SSL for point clouds.
arXiv Detail & Related papers (2023-05-09T08:47:09Z) - Explaining, Analyzing, and Probing Representations of Self-Supervised
Learning Models for Sensor-based Human Activity Recognition [2.2082422928825136]
Self-supervised learning (SSL) frameworks have been extensively applied to sensor-based Human Activity Recognition (HAR)
In this paper, we aim to analyze deep representations of two recent SSL frameworks, namely SimCLR and VICReg.
arXiv Detail & Related papers (2023-04-14T07:53:59Z) - A Survey on Self-supervised Learning: Algorithms, Applications, and Future Trends [82.64268080902742]
Self-supervised learning (SSL) aims to learn discriminative features from unlabeled data without relying on human-annotated labels.
SSL has garnered significant attention recently, leading to the development of numerous related algorithms.
This paper presents a review of diverse SSL methods, encompassing algorithmic aspects, application domains, three key trends, and open research questions.
arXiv Detail & Related papers (2023-01-13T14:41:05Z) - Semi-Supervised and Unsupervised Deep Visual Learning: A Survey [76.2650734930974]
Semi-supervised learning and unsupervised learning offer promising paradigms to learn from an abundance of unlabeled visual data.
We review the recent advanced deep learning algorithms on semi-supervised learning (SSL) and unsupervised learning (UL) for visual recognition from a unified perspective.
arXiv Detail & Related papers (2022-08-24T04:26:21Z) - DATA: Domain-Aware and Task-Aware Pre-training [94.62676913928831]
We present DATA, a simple yet effective NAS approach specialized for self-supervised learning (SSL)
Our method achieves promising results across a wide range of computation costs on downstream tasks, including image classification, object detection and semantic segmentation.
arXiv Detail & Related papers (2022-03-17T02:38:49Z) - Self-supervised on Graphs: Contrastive, Generative,or Predictive [25.679620842010422]
Self-supervised learning (SSL) is emerging as a new paradigm for extracting informative knowledge through well-designed pretext tasks.
We divide existing graph SSL methods into three categories: contrastive, generative, and predictive.
We also summarize the commonly used datasets, evaluation metrics, downstream tasks, and open-source implementations of various algorithms.
arXiv Detail & Related papers (2021-05-16T03:30:03Z) - Graph-based Semi-supervised Learning: A Comprehensive Review [51.26862262550445]
Semi-supervised learning (SSL) has tremendous value in practice due to its ability to utilize both labeled data and unlabelled data.
An important class of SSL methods is to naturally represent data as graphs, which corresponds to graph-based semi-supervised learning (GSSL) methods.
GSSL methods have demonstrated their advantages in various domains due to their uniqueness of structure, the universality of applications, and their scalability to large scale data.
arXiv Detail & Related papers (2021-02-26T05:11:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.