Contrastive Mixup: Self- and Semi-Supervised learning for Tabular Domain
- URL: http://arxiv.org/abs/2108.12296v1
- Date: Fri, 27 Aug 2021 14:09:13 GMT
- Title: Contrastive Mixup: Self- and Semi-Supervised learning for Tabular Domain
- Authors: Sajad Darabi, Shayan Fazeli, Ali Pazoki, Sriram Sankararaman, Majid
Sarrafzadeh
- Abstract summary: We introduce Contrastive Mixup, a semi-supervised learning framework for tabular data.
Our proposed method leverages Mixup-based augmentation under the manifold assumption.
We demonstrate the effectiveness of the proposed framework on public datasets and real-world clinical datasets.
- Score: 8.25619425813267
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent literature in self-supervised learning has demonstrated
significant progress in closing the gap between supervised and unsupervised
methods in the image and text domains. These methods rely on domain-specific
augmentations that are not directly amenable to the tabular domain. Instead,
we introduce Contrastive Mixup, a semi-supervised learning framework for
tabular data, and demonstrate its effectiveness in limited annotated data
settings. Our proposed method leverages Mixup-based augmentation under the
manifold assumption by mapping samples to a low-dimensional latent space and
encouraging interpolated samples to have high similarity within the same
labeled class. Unlabeled samples are additionally employed via a transductive
label propagation method to further enrich the set of similar and dissimilar
pairs that can be used in the contrastive loss term. We demonstrate the
effectiveness of the proposed framework on public tabular datasets and
real-world clinical datasets.
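To make the two ingredients concrete, here is a minimal sketch of latent-space Mixup feeding a supervised contrastive loss. The encoder, the Beta/temperature hyperparameters, and the rule that a mixed sample inherits the label of its dominant component are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def contrastive_mixup_loss(encoder, x, y, alpha=0.2, tau=0.5):
    # Map samples to the low-dimensional latent space (manifold assumption).
    z = encoder(x)                                    # (N, d)

    # Mixup in latent space: interpolate each sample with a random partner.
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(z.size(0))
    z_mix = lam * z + (1 - lam) * z[perm]

    # Assumption: a mixed sample inherits its dominant component's label.
    y_mix = y if lam >= 0.5 else y[perm]

    # Cosine similarities over originals and interpolations.
    z_all = F.normalize(torch.cat([z, z_mix]), dim=1)
    y_all = torch.cat([y, y_mix])
    sim = z_all @ z_all.t() / tau
    sim.fill_diagonal_(-1e9)                          # exclude self-pairs

    # Positives: any pair sharing a (possibly inherited) class label.
    pos = (y_all[:, None] == y_all[None, :]).float()
    pos.fill_diagonal_(0)

    # InfoNCE-style supervised contrastive objective.
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    return -(pos * log_prob).sum(1).div(pos.sum(1).clamp(min=1)).mean()
```

Unlabeled samples would enter the same objective once the transductive label propagation step assigns them pseudo labels, enlarging the pool of positive and negative pairs.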
Related papers
- Graph-Based Semi-Supervised Segregated Lipschitz Learning [0.21847754147782888]
This paper presents an approach to semi-supervised learning for the classification of data using Lipschitz learning on graphs.
We develop a graph-based semi-supervised learning framework that leverages the properties of the infinity Laplacian to propagate labels in a dataset where only a few samples are labeled.
arXiv Detail & Related papers (2024-11-05T17:16:56Z)
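As a rough illustration of this flavor of propagation, the sketch below iterates the discrete infinity-harmonic update, in which each unlabeled node moves to the midpoint of its most extreme neighbors; the inputs and iteration count are assumptions, and the paper's segregated scheme adds structure beyond this.

```python
import numpy as np

def inf_laplacian_propagate(adj, labels, n_iter=200):
    # adj: (n, n) boolean adjacency matrix; labels: int array with values
    # in {0..C-1} for labeled nodes and -1 for unlabeled ones (assumed).
    n_cls = labels.max() + 1
    u = np.zeros((len(labels), n_cls))
    labeled = labels >= 0
    u[labeled] = np.eye(n_cls)[labels[labeled]]
    nbrs = [np.flatnonzero(adj[i]) for i in range(len(labels))]
    for _ in range(n_iter):
        for i in np.flatnonzero(~labeled):
            if nbrs[i].size:                 # isolated nodes keep zeros
                v = u[nbrs[i]]
                # Discrete infinity-Laplacian update: midpoint of the
                # largest and smallest neighboring score per class.
                u[i] = 0.5 * (v.max(axis=0) + v.min(axis=0))
    return u.argmax(axis=1)                  # propagated hard labels
```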
- Downstream-Pretext Domain Knowledge Traceback for Active Learning [138.02530777915362]
We propose a downstream-pretext domain knowledge traceback (DOKT) method that traces the data interactions of downstream knowledge and pre-training guidance.
DOKT consists of a traceback diversity indicator and a domain-based uncertainty estimator.
Experiments conducted on ten datasets show that our model outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2024-07-20T01:34:13Z)
- Supervised Stochastic Neighbor Embedding Using Contrastive Learning [4.560284382063488]
Clusters of samples belonging to the same class are pulled together in low-dimensional embedding space.
We extend the self-supervised contrastive approach to the fully-supervised setting, allowing us to effectively leverage label information.
arXiv Detail & Related papers (2023-09-15T00:26:21Z)
- Progressive Feature Upgrade in Semi-supervised Learning on Tabular Domain [0.0]
Recent semi-supervised and self-supervised methods have shown great success in the image and text domains.
It is not easy to adapt domain-specific transformations from images and language to tabular data because of the mix of different data types.
We propose using a conditional probability representation and an efficient progressive feature upgrading framework.
arXiv Detail & Related papers (2022-12-01T22:18:32Z)
- Mutual- and Self- Prototype Alignment for Semi-supervised Medical Image Segmentation [5.426994893258762]
We propose a mutual- and self- prototype alignment (MSPA) framework to better utilize the unlabeled data.
Specifically, mutual-prototype alignment enhances the information interaction between labeled and unlabeled data.
Our method also outperforms seven state-of-the-art semi-supervised segmentation methods on all three datasets.
arXiv Detail & Related papers (2022-06-03T02:59:22Z)
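As intuition for prototype alignment in general, a minimal sketch might compute class prototypes as labeled-embedding means and pull unlabeled samples toward their nearest prototype; MSPA's mutual and self variants are richer than this, and all names below are illustrative.

```python
import torch
import torch.nn.functional as F

def prototype_alignment_loss(z_lab, y_lab, z_unlab, tau=0.1):
    # Class prototypes: the mean embedding of each labeled class.
    classes = torch.unique(y_lab)
    protos = torch.stack([z_lab[y_lab == c].mean(dim=0) for c in classes])
    protos = F.normalize(protos, dim=1)

    # Score unlabeled samples against every prototype.
    z_u = F.normalize(z_unlab, dim=1)
    logits = z_u @ protos.t() / tau

    # Pull each unlabeled sample toward its nearest prototype
    # (a self-training surrogate for the alignment terms).
    pseudo = logits.argmax(dim=1)
    return F.cross_entropy(logits, pseudo)
```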
- Information Symmetry Matters: A Modal-Alternating Propagation Network for Few-Shot Learning [118.45388912229494]
We propose a Modal-Alternating Propagation Network (MAP-Net) to supplement the absent semantic information of unlabeled samples.
We design a Relation Guidance (RG) strategy to guide the visual relation vectors via semantics so that the propagated information is more beneficial.
Our proposed method achieves promising performance and outperforms the state-of-the-art approaches.
arXiv Detail & Related papers (2021-09-03T03:43:53Z)
- GuidedMix-Net: Learning to Improve Pseudo Masks Using Labeled Images as Reference [153.354332374204]
We propose a novel method for semi-supervised semantic segmentation named GuidedMix-Net.
We first introduce a feature alignment objective between labeled and unlabeled data to capture potentially similar image pairs.
MITrans is shown to be a powerful knowledge module for progressively refining the features of unlabeled data.
Along with supervised learning for labeled data, the prediction of unlabeled data is jointly learned with the generated pseudo masks.
arXiv Detail & Related papers (2021-06-29T02:48:45Z)
- CrowdTeacher: Robust Co-teaching with Noisy Answers & Sample-specific Perturbations for Tabular Data [8.276156981100364]
Co-teaching methods have shown promising improvements for computer vision problems with noisy labels.
Our model, CrowdTeacher, builds on the idea that robustness to input-space perturbations can improve the classifier's tolerance of noisy labels.
We showcase the boost in predictive power attained using CrowdTeacher for both synthetic and real datasets.
arXiv Detail & Related papers (2021-03-31T15:09:38Z)
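For background, classic co-teaching (which CrowdTeacher extends with sample-specific perturbations) trains two networks that each select their small-loss, presumably clean, samples and pass them to the peer for its update. The sketch below is that generic scheme under assumed names, not CrowdTeacher itself.

```python
import torch
import torch.nn.functional as F

def co_teaching_step(net_a, net_b, opt_a, opt_b, x, y, keep_ratio=0.8):
    # Each network ranks the batch by its own per-sample loss.
    k = max(1, int(keep_ratio * x.size(0)))
    loss_a = F.cross_entropy(net_a(x), y, reduction='none')
    loss_b = F.cross_entropy(net_b(x), y, reduction='none')
    idx_for_b = loss_a.argsort()[:k]    # A nominates clean samples for B
    idx_for_a = loss_b.argsort()[:k]    # B nominates clean samples for A

    # Each network updates only on the samples its peer selected.
    opt_a.zero_grad()
    F.cross_entropy(net_a(x[idx_for_a]), y[idx_for_a]).backward()
    opt_a.step()
    opt_b.zero_grad()
    F.cross_entropy(net_b(x[idx_for_b]), y[idx_for_b]).backward()
    opt_b.step()
```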
- Towards Domain-Agnostic Contrastive Learning [103.40783553846751]
We propose a novel domain-agnostic approach to contrastive learning, named DACL.
Key to our approach is the use of Mixup noise to create similar and dissimilar examples by mixing data samples differently either at the input or hidden-state levels.
Our results show that DACL not only outperforms other domain-agnostic noising methods, such as Gaussian-noise, but also combines well with domain-specific methods, such as SimCLR.
arXiv Detail & Related papers (2020-11-09T13:41:56Z)
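The Mixup-noise idea reduces to a few lines: a positive view of each sample is created by mixing in a small amount of a randomly chosen partner, after which any standard contrastive objective applies. The mixing range below is an assumed hyperparameter, and the same operation can be applied to hidden states instead of inputs.

```python
import torch

def mixup_noise_views(x, low=0.7, high=1.0):
    # Two stochastic 'views' per sample: each keeps the anchor dominant
    # (lam close to 1) so the pair remains semantically similar.
    def one_view(x):
        lam = torch.empty(x.size(0), 1).uniform_(low, high)
        partner = x[torch.randperm(x.size(0))]
        return lam * x + (1 - lam) * partner
    return one_view(x), one_view(x)
```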
- Deep Semi-supervised Knowledge Distillation for Overlapping Cervical Cell Instance Segmentation [54.49894381464853]
We propose to leverage both labeled and unlabeled data for instance segmentation with improved accuracy by knowledge distillation.
We propose a novel Mask-guided Mean Teacher framework with Perturbation-sensitive Sample Mining.
Experiments show that the proposed method improves the performance significantly compared with the supervised method learned from labeled data only.
arXiv Detail & Related papers (2020-07-21T13:27:09Z)
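The mean-teacher component is compact enough to sketch: the teacher's weights are an exponential moving average of the student's, and a consistency loss ties their predictions on unlabeled data. The mask guidance and perturbation-sensitive mining are beyond this generic sketch.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def ema_update(teacher, student, decay=0.99):
    # Teacher weights trail the student as an exponential moving average.
    for t_p, s_p in zip(teacher.parameters(), student.parameters()):
        t_p.mul_(decay).add_(s_p, alpha=1 - decay)

def consistency_loss(student, teacher, x_unlab):
    # The student is trained to match the more stable teacher predictions.
    with torch.no_grad():
        target = teacher(x_unlab).softmax(dim=1)
    return F.mse_loss(student(x_unlab).softmax(dim=1), target)
```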
- MatchGAN: A Self-Supervised Semi-Supervised Conditional Generative Adversarial Network [51.84251358009803]
We present a novel self-supervised learning approach for conditional generative adversarial networks (GANs) under a semi-supervised setting.
We perform augmentation by randomly sampling sensible labels from the label space of the few labelled examples available.
Our method surpasses the baseline with only 20% of the labelled examples used to train the baseline.
arXiv Detail & Related papers (2020-06-11T17:14:55Z)
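The label-sampling augmentation amounts to drawing synthetic conditioning labels from the empirical distribution of the few labeled examples; a minimal sketch, with names assumed:

```python
import torch

def sample_conditioning_labels(y_labeled, n):
    # Empirical class distribution of the small labeled set.
    counts = torch.bincount(y_labeled).float()
    dist = counts / counts.sum()
    # Draw n plausible labels to condition the generator on.
    return torch.multinomial(dist, n, replacement=True)
```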