Progressive Feature Upgrade in Semi-supervised Learning on Tabular
Domain
- URL: http://arxiv.org/abs/2212.00892v1
- Date: Thu, 1 Dec 2022 22:18:32 GMT
- Title: Progressive Feature Upgrade in Semi-supervised Learning on Tabular
Domain
- Authors: Morteza Mohammady Gharasuie, Fenjiao Wang
- Abstract summary: Recent semi-supervised and self-supervised methods have shown great success in the image and text domain.
It is not easy to adapt domain-specific transformations from image and language to tabular data due to the mixing of different data types.
We propose using a conditional probability representation and an efficient progressive feature upgrading framework.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Recent semi-supervised and self-supervised methods have shown great success
in the image and text domain by utilizing augmentation techniques. Despite such
success, it is not easy to transfer it to the tabular domain: domain-specific
transformations from image and language are hard to adapt to tabular data
because continuous and categorical data types are mixed there. There are a few semi-supervised works
on the tabular domain that have focused on proposing new augmentation
techniques for tabular data. These approaches may have shown some improvement
on datasets with low-cardinality categorical data. However, the fundamental
challenges have not been tackled: the proposed methods either do not apply to
datasets with high-cardinality categorical features or do not use an efficient
encoding of categorical data. We propose using a conditional probability
representation and an efficient progressive feature upgrading framework to
effectively learn representations for tabular data in semi-supervised
applications. The extensive
experiments show superior performance of the proposed framework and the
potential application in semi-supervised settings.
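The abstract does not spell out the conditional probability representation. One common realization of the idea, encoding each category by the smoothed empirical label distribution P(y | category) estimated from labeled rows, can be sketched as follows; all names here are illustrative, not the paper's actual implementation:

```python
from collections import Counter, defaultdict

def conditional_probability_encode(categories, labels, n_classes, smoothing=1.0):
    """Encode each category value by the smoothed empirical distribution
    P(label | category) observed in labeled data. High-cardinality columns
    thus map to fixed-length vectors of size n_classes."""
    counts = defaultdict(Counter)
    for c, y in zip(categories, labels):
        counts[c][y] += 1
    encoding = {}
    for c, cnt in counts.items():
        # Laplace smoothing keeps rare categories from collapsing to 0/1.
        total = sum(cnt.values()) + smoothing * n_classes
        encoding[c] = [(cnt[k] + smoothing) / total for k in range(n_classes)]
    return encoding

enc = conditional_probability_encode(
    ["a", "a", "b", "b", "b"], [0, 1, 1, 1, 0], n_classes=2)
```

Unlike one-hot encoding, the output dimensionality is independent of the column's cardinality, which is why this family of encodings suits high-cardinality categorical data.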
Related papers
- Training-Free Generalization on Heterogeneous Tabular Data via
Meta-Representation [67.30538142519067]
We propose Tabular data Pre-Training via Meta-representation (TabPTM)
A deep neural network is then trained to associate these meta-representations with dataset-specific classification confidences.
Experiments validate that TabPTM achieves promising performance in new datasets, even under few-shot scenarios.
arXiv Detail & Related papers (2023-10-31T18:03:54Z)
- Revisiting Self-Training with Regularized Pseudo-Labeling for Tabular
Data [0.0]
We revisit self-training which can be applied to any kind of algorithm including gradient boosting decision tree.
We propose a novel pseudo-labeling approach that regularizes the confidence scores based on the likelihoods of the pseudo-labels.
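That paper's likelihood-based regularization is not detailed here; the generic confidence-thresholded pseudo-labeling step it builds on might look like the following sketch (function name and threshold are hypothetical):

```python
def select_pseudo_labels(probs, threshold=0.9):
    """Keep only unlabeled examples whose top predicted class probability
    exceeds the threshold; return (example_index, pseudo_label) pairs."""
    selected = []
    for i, p in enumerate(probs):
        conf = max(p)
        if conf >= threshold:
            selected.append((i, p.index(conf)))
    return selected

# Low-confidence predictions (here the second row) are left unlabeled.
pseudo = select_pseudo_labels([[0.95, 0.05], [0.6, 0.4], [0.1, 0.9]])
```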
arXiv Detail & Related papers (2023-02-27T18:12:56Z)
- Bi-level Alignment for Cross-Domain Crowd Counting [113.78303285148041]
Current methods rely on external data for training an auxiliary task or apply an expensive coarse-to-fine estimation.
We develop a new adversarial learning based method, which is simple and efficient to apply.
We evaluate our approach on five real-world crowd counting benchmarks, where we outperform existing approaches by a large margin.
arXiv Detail & Related papers (2022-05-12T02:23:25Z)
- Dominant Set-based Active Learning for Text Classification and its
Application to Online Social Media [0.0]
We present a novel pool-based active learning method for the training of large unlabeled corpus with minimum annotation cost.
Our proposed method does not have any parameters to be tuned, making it dataset-independent.
Our method achieves a higher performance in comparison to the state-of-the-art active learning strategies.
arXiv Detail & Related papers (2022-01-28T19:19:03Z)
- CvS: Classification via Segmentation For Small Datasets [52.821178654631254]
This paper presents CvS, a cost-effective classifier for small datasets that derives the classification labels from predicting the segmentation maps.
We evaluate the effectiveness of our framework on diverse problems showing that CvS is able to achieve much higher classification results compared to previous methods when given only a handful of examples.
arXiv Detail & Related papers (2021-10-29T18:41:15Z)
- Contrastive Mixup: Self- and Semi-Supervised learning for Tabular Domain [8.25619425813267]
We introduce Contrastive Mixup, a semi-supervised learning framework for tabular data.
Our proposed method leverages Mixup-based augmentation under the manifold assumption.
We demonstrate the effectiveness of the proposed framework on public datasets and real-world clinical datasets.
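As a rough illustration of the Mixup-based augmentation this entry refers to (not the paper's exact procedure), two examples and their one-hot labels can be interpolated with a Beta-distributed coefficient:

```python
import random

def mixup(x1, x2, y1, y2, alpha=0.2):
    """Linearly interpolate two feature vectors and their one-hot labels
    with a mixing coefficient drawn from Beta(alpha, alpha)."""
    lam = random.betavariate(alpha, alpha)
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam

random.seed(0)
x, y, lam = mixup([0.0, 0.0], [1.0, 1.0], [1, 0], [0, 1])
```

Under the manifold assumption, such interpolated points are treated as plausible examples whose soft labels interpolate accordingly.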
arXiv Detail & Related papers (2021-08-27T14:09:13Z)
- SCARF: Self-Supervised Contrastive Learning using Random Feature
Corruption [72.35532598131176]
We propose SCARF, a technique for contrastive learning, where views are formed by corrupting a random subset of features.
We show that SCARF complements existing strategies and outperforms alternatives like autoencoders.
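SCARF's corruption can be approximated as follows: for each anchor row, a random subset of features is replaced by values drawn from those features' empirical marginals, i.e., from other rows. This sketch uses illustrative names and a simplified sampling scheme:

```python
import random

def scarf_corrupt(row, dataset, corruption_rate=0.6, rng=random):
    """Form a SCARF-style view of `row`: replace a random subset of its
    features with the same feature's value from randomly chosen rows."""
    n = len(row)
    k = max(1, int(corruption_rate * n))
    idx = rng.sample(range(n), k)        # features to corrupt
    view = list(row)
    for j in idx:
        view[j] = rng.choice(dataset)[j]  # draw from feature j's marginal
    return view

dataset = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
random.seed(1)
view = scarf_corrupt([1, 2, 3], dataset)
```

The anchor and its corrupted view then serve as a positive pair for a contrastive loss such as InfoNCE.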
arXiv Detail & Related papers (2021-06-29T08:08:33Z)
- Semi-supervised Meta-learning with Disentanglement for
Domain-generalised Medical Image Segmentation [15.351113774542839]
Generalising models to new data from new centres (termed here domains) remains a challenge.
We propose a novel semi-supervised meta-learning framework with disentanglement.
We show that the proposed method is robust on different segmentation tasks and achieves state-of-the-art generalisation performance on two public benchmarks.
arXiv Detail & Related papers (2021-06-24T19:50:07Z)
- i-Mix: A Domain-Agnostic Strategy for Contrastive Representation
Learning [117.63815437385321]
We propose i-Mix, a simple yet effective domain-agnostic regularization strategy for improving contrastive representation learning.
In experiments, we demonstrate that i-Mix consistently improves the quality of learned representations across domains.
arXiv Detail & Related papers (2020-10-17T23:32:26Z)
- $n$-Reference Transfer Learning for Saliency Prediction [73.17061116358036]
We propose a few-shot transfer learning paradigm for saliency prediction.
The proposed framework is gradient-based and model-agnostic.
The results show that the proposed framework achieves a significant performance improvement.
arXiv Detail & Related papers (2020-07-09T23:20:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.