Pseudo-label Based Domain Adaptation for Zero-Shot Text Steganalysis
- URL: http://arxiv.org/abs/2406.18565v1
- Date: Sat, 1 Jun 2024 04:19:07 GMT
- Title: Pseudo-label Based Domain Adaptation for Zero-Shot Text Steganalysis
- Authors: Yufei Luo, Zhen Yang, Ru Zhang, Jianyi Liu
- Abstract summary: This paper proposes a cross-domain stego-text analysis method (PDTS) based on pseudo-labeling and domain adaptation (unsupervised learning).
We train the model using labeled source domain data and adapt it to target domain data distribution using pseudo-labels for unlabeled target domain data through self-training.
Experimental results demonstrate that our method performs well in zero-shot text steganalysis tasks, achieving high detection accuracy even in the absence of labeled data in the target domain, and outperforms current zero-shot text steganalysis methods.
- Score: 10.587545153412314
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Currently, most methods for text steganalysis are based on deep neural networks (DNNs). However, in real-life scenarios, obtaining a sufficient amount of labeled stego-text for correctly training networks using a large number of parameters is often challenging and costly. Additionally, due to a phenomenon known as dataset bias or domain shift, recognition models trained on a large dataset exhibit poor generalization performance on novel datasets and tasks. Therefore, to address the issues of missing labeled data and inadequate model generalization in text steganalysis, this paper proposes a cross-domain stego-text analysis method (PDTS) based on pseudo-labeling and domain adaptation (unsupervised learning). Specifically, we propose a model architecture combining pre-trained BERT with a single-layer Bi-LSTM to learn and extract generic features across tasks and generate task-specific representations. Considering the differential contributions of different features to steganalysis, we further design a feature filtering mechanism to achieve selective feature propagation, thereby enhancing classification performance. We train the model using labeled source domain data and adapt it to target domain data distribution using pseudo-labels for unlabeled target domain data through self-training. In the label estimation step, instead of using a static sampling strategy, we propose a progressive sampling strategy to gradually increase the number of selected pseudo-label candidates. Experimental results demonstrate that our method performs well in zero-shot text steganalysis tasks, achieving high detection accuracy even in the absence of labeled data in the target domain, and outperforms current zero-shot text steganalysis methods.
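The progressive sampling idea in the abstract (gradually increasing the number of selected pseudo-label candidates across self-training rounds) can be illustrated with a minimal sketch. The abstract does not specify the schedule or the confidence measure, so everything below is an assumption made for illustration: the linear schedule, the hypothetical helpers `progressive_quota` and `select_pseudo_labels`, the parameters `start_frac` and `step_frac`, and the use of max(p, 1 - p) as confidence for a binary cover/stego classifier.

```python
from typing import List, Tuple


def progressive_quota(round_idx: int, n_target: int,
                      start_frac: float = 0.2, step_frac: float = 0.2) -> int:
    """Number of pseudo-label candidates to keep in a given round.

    Assumed linear schedule (not taken from the paper): the kept fraction
    grows by `step_frac` per round and is capped at the whole target set.
    """
    frac = min(1.0, start_frac + step_frac * round_idx)
    return max(1, min(n_target, round(frac * n_target)))


def select_pseudo_labels(stego_probs: List[float],
                         quota: int) -> List[Tuple[int, int]]:
    """Pick the `quota` most confident target samples.

    `stego_probs[i]` is the model's predicted probability that unlabeled
    target sample i is stego-text. Returns (sample_index, pseudo_label)
    pairs, most confident first; label 1 = stego, 0 = cover.
    """
    scored = []
    for i, p in enumerate(stego_probs):
        label = 1 if p >= 0.5 else 0
        confidence = max(p, 1.0 - p)  # assumed confidence measure
        scored.append((confidence, i, label))
    # Sort by confidence, highest first; ties keep the original order.
    scored.sort(key=lambda item: -item[0])
    return [(i, label) for _, i, label in scored[:quota]]


if __name__ == "__main__":
    # Made-up probabilities standing in for predictions on target data.
    probs = [0.95, 0.55, 0.10, 0.48, 0.80]
    for round_idx in range(3):
        quota = progressive_quota(round_idx, n_target=len(probs))
        chosen = select_pseudo_labels(probs, quota)
        # A real self-training loop would retrain the model on the source
        # data plus `chosen` here and then re-predict `probs`.
        print(round_idx, quota, chosen)
```

In contrast to a static strategy that keeps a fixed number of candidates every round, this schedule starts with only the most confident predictions and admits less confident ones as the model adapts to the target domain.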
Related papers
- Downstream-Pretext Domain Knowledge Traceback for Active Learning [138.02530777915362]
We propose a downstream-pretext domain knowledge traceback (DOKT) method that traces the data interactions of downstream knowledge and pre-training guidance.
DOKT consists of a traceback diversity indicator and a domain-based uncertainty estimator.
Experiments conducted on ten datasets show that our model outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2024-07-20T01:34:13Z)
- Multi-Source Soft Pseudo-Label Learning with Domain Similarity-based Weighting for Semantic Segmentation [2.127049691404299]
This paper describes a method of domain adaptive training for semantic segmentation using multiple source datasets.
We propose a soft pseudo-label generation method by integrating predicted object probabilities from multiple source models.
arXiv Detail & Related papers (2023-03-02T05:20:36Z)
- Robust Target Training for Multi-Source Domain Adaptation [110.77704026569499]
We propose a novel Bi-level Optimization based Robust Target Training (BORT$^2$) method for MSDA.
Our proposed method achieves state-of-the-art performance on three MSDA benchmarks, including the large-scale DomainNet dataset.
arXiv Detail & Related papers (2022-10-04T15:20:01Z)
- Domain Adaptation Principal Component Analysis: base linear method for learning with out-of-distribution data [55.41644538483948]
Domain adaptation is a popular paradigm in modern machine learning.
We present a method called Domain Adaptation Principal Component Analysis (DAPCA).
DAPCA finds a linear reduced data representation useful for solving the domain adaptation task.
arXiv Detail & Related papers (2022-08-28T21:10:56Z)
- Low-confidence Samples Matter for Domain Adaptation [47.552605279925736]
Domain adaptation (DA) aims to transfer knowledge from a label-rich source domain to a related but label-scarce target domain.
We propose a novel contrastive learning method by processing low-confidence samples.
We evaluate the proposed method in both unsupervised and semi-supervised DA settings.
arXiv Detail & Related papers (2022-02-06T15:45:45Z)
- Attentive Prototypes for Source-free Unsupervised Domain Adaptive 3D Object Detection [85.11649974840758]
3D object detection networks tend to be biased towards the data they are trained on.
We propose a single-frame approach for source-free, unsupervised domain adaptation of lidar-based 3D object detectors.
arXiv Detail & Related papers (2021-11-30T18:42:42Z)
- Adapting Segmentation Networks to New Domains by Disentangling Latent Representations [14.050836886292869]
Domain adaptation approaches have come into play to transfer knowledge acquired on a label-abundant source domain to a related label-scarce target domain.
We propose a novel performance metric to capture the relative efficacy of an adaptation strategy compared to supervised training.
arXiv Detail & Related papers (2021-08-06T09:43:07Z)
- TraND: Transferable Neighborhood Discovery for Unsupervised Cross-domain Gait Recognition [77.77786072373942]
This paper proposes a Transferable Neighborhood Discovery (TraND) framework to bridge the domain gap for unsupervised cross-domain gait recognition.
We design an end-to-end trainable approach to automatically discover the confident neighborhoods of unlabeled samples in the latent space.
Our method achieves state-of-the-art results on two public datasets, i.e., CASIA-B and OU-LP.
arXiv Detail & Related papers (2021-02-09T03:07:07Z)
- PseudoSeg: Designing Pseudo Labels for Semantic Segmentation [78.35515004654553]
We present a re-design of pseudo-labeling to generate structured pseudo labels for training with unlabeled or weakly-labeled data.
We demonstrate the effectiveness of the proposed pseudo-labeling strategy in both low-data and high-data regimes.
arXiv Detail & Related papers (2020-10-19T17:59:30Z)
- Unsupervised Domain Adaptation for Person Re-Identification through Source-Guided Pseudo-Labeling [2.449909275410288]
Person Re-Identification (re-ID) aims at retrieving images of the same person taken by different cameras.
Unsupervised Domain Adaptation (UDA) is an interesting research direction for this challenge as it avoids a costly annotation of the target data.
We introduce a framework which relies on a two-branch architecture optimizing classification and triplet loss based metric learning in source and target domains.
arXiv Detail & Related papers (2020-09-20T14:54:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.