Local Contrastive Feature learning for Tabular Data
- URL: http://arxiv.org/abs/2211.10549v1
- Date: Sat, 19 Nov 2022 00:53:41 GMT
- Title: Local Contrastive Feature learning for Tabular Data
- Authors: Zhabiz Gharibshah, Xingquan Zhu
- Abstract summary: We propose a new local contrastive feature learning framework (LoCL)
In order to create a niche for local learning, we use feature correlations to create a maximum-spanning tree, and break the tree into feature subsets.
Convolutional learning of the features is used to learn latent feature space, regulated by contrastive and reconstruction losses.
- Score: 8.93957397187611
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Contrastive self-supervised learning has been successfully used in many
domains, such as images, texts, graphs, etc., to learn features without
requiring label information. In this paper, we propose a new local contrastive
feature learning (LoCL) framework, and our theme is to learn local
patterns/features from tabular data. In order to create a niche for local
learning, we use feature correlations to create a maximum-spanning tree, and
break the tree into feature subsets, with strongly correlated features being
assigned next to each other. Convolutional learning of the features is used to
learn latent feature space, regulated by contrastive and reconstruction losses.
Experiments on public tabular datasets show the effectiveness of the proposed
method versus state-of-the-art baseline methods.
Related papers
- TabSeq: A Framework for Deep Learning on Tabular Data via Sequential Ordering [5.946579489162407]
This work introduces TabSeq, a novel framework for the sequential ordering of features.
Finding the optimum sequence order for such features could improve the deep learning models' learning process.
arXiv Detail & Related papers (2024-10-17T04:10:36Z) - Optimized Feature Generation for Tabular Data via LLMs with Decision Tree Reasoning [53.241569810013836]
We propose a new framework based on large language models (LLMs) and decision Tree reasoning (OCTree)
Our key idea is to leverage LLMs' reasoning capabilities to find good feature generation rules without manually specifying the search space.
Our empirical results demonstrate that this simple framework consistently enhances the performance of various prediction models.
arXiv Detail & Related papers (2024-06-12T08:31:34Z) - Binning as a Pretext Task: Improving Self-Supervised Learning in Tabular Domains [0.565395466029518]
We propose a novel pretext task based on the classical binning method.
The idea is straightforward: reconstructing the bin indices (either orders or classes) rather than the original values.
Our empirical investigations ascertain several advantages of binning.
arXiv Detail & Related papers (2024-05-13T01:23:14Z) - Learning Representations without Compositional Assumptions [79.12273403390311]
We propose a data-driven approach that learns feature set dependencies by representing feature sets as graph nodes and their relationships as learnable edges.
We also introduce LEGATO, a novel hierarchical graph autoencoder that learns a smaller, latent graph to aggregate information from multiple views dynamically.
arXiv Detail & Related papers (2023-05-31T10:36:10Z) - Transfer Learning with Deep Tabular Models [66.67017691983182]
We show that upstream data gives tabular neural networks a decisive advantage over GBDT models.
We propose a realistic medical diagnosis benchmark for tabular transfer learning.
We propose a pseudo-feature method for cases where the upstream and downstream feature sets differ.
arXiv Detail & Related papers (2022-06-30T14:24:32Z) - SubTab: Subsetting Features of Tabular Data for Self-Supervised
Representation Learning [5.5616364225463055]
We introduce a new framework, Subsetting features of Tabular data (SubTab)
In this paper, we introduce a new framework, Subsetting features of Tabular data (SubTab)
We argue that reconstructing the data from the subset of its features rather than its corrupted version in an autoencoder setting can better capture its underlying representation.
arXiv Detail & Related papers (2021-10-08T20:11:09Z) - Capturing Structural Locality in Non-parametric Language Models [85.94669097485992]
We propose a simple yet effective approach for adding locality information into non-parametric language models.
Experiments on two different domains, Java source code and Wikipedia text, demonstrate that locality features improve model efficacy.
arXiv Detail & Related papers (2021-10-06T15:53:38Z) - Deep Reinforcement Learning of Graph Matching [63.469961545293756]
Graph matching (GM) under node and pairwise constraints has been a building block in areas from optimization to computer vision.
We present a reinforcement learning solver for GM i.e. RGM that seeks the node correspondence between pairwise graphs.
Our method differs from the previous deep graph matching model in the sense that they are focused on the front-end feature extraction and affinity function learning.
arXiv Detail & Related papers (2020-12-16T13:48:48Z) - SLADE: A Self-Training Framework For Distance Metric Learning [75.54078592084217]
We present a self-training framework, SLADE, to improve retrieval performance by leveraging additional unlabeled data.
We first train a teacher model on the labeled data and use it to generate pseudo labels for the unlabeled data.
We then train a student model on both labels and pseudo labels to generate final feature embeddings.
arXiv Detail & Related papers (2020-11-20T08:26:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.