Leveraging Data Geometry to Mitigate CSM in Steganalysis
- URL: http://arxiv.org/abs/2310.04479v1
- Date: Fri, 6 Oct 2023 09:08:25 GMT
- Title: Leveraging Data Geometry to Mitigate CSM in Steganalysis
- Authors: Rony Abecidan (CRIStAL, CNRS), Vincent Itier (IMT Nord Europe,
CRIStAL), Jérémie Boulanger (CRIStAL), Patrick Bas (CRIStAL, CNRS),
Tomáš Pevný (CTU)
- Abstract summary: In operational scenarios, steganographers use sets of covers from various sensors and processing pipelines that differ significantly from those used by researchers to train steganalysis models.
This leads to an inevitable performance gap when dealing with out-of-distribution covers, commonly referred to as Cover Source Mismatch (CSM).
In this study, we consider the scenario where all test images are processed using the same pipeline. Our objective is to identify a training dataset that allows for maximum generalization to our target.
- Score: 1.130790932059036
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In operational scenarios, steganographers use sets of covers from various
sensors and processing pipelines that differ significantly from those used by
researchers to train steganalysis models. This leads to an inevitable
performance gap when dealing with out-of-distribution covers, commonly referred
to as Cover Source Mismatch (CSM). In this study, we consider the scenario
where test images are processed using the same pipeline. However, knowledge
regarding both the labels and the balance between cover and stego is missing.
Our objective is to identify a training dataset that allows for maximum
generalization to our target. By exploring a grid of processing pipelines
fostering CSM, we discovered a geometrical metric based on the chordal distance
between subspaces spanned by DCTr features, that exhibits high correlation with
operational regret while being not affected by the cover-stego balance. Our
contribution lies in the development of a strategy that enables the selection
or derivation of customized training datasets, enhancing the overall
generalization performance for a given target. Experimental validation
highlights that our geometry-based optimization strategy outperforms
traditional atomistic methods given reasonable assumptions. Additional
resources are available at
github.com/RonyAbecidan/LeveragingGeometrytoMitigateCSM.
Related papers
- DRED: Zero-Shot Transfer in Reinforcement Learning via Data-Regularised Environment Design [11.922951794283168]
In this work, we investigate how the sampling of individual environment instances, or levels, affects the zero-shot generalisation (ZSG) ability of RL agents.
We discover that for deep actor-critic architectures sharing their base layers, prioritising levels according to their value loss minimises the mutual information between the agent's internal representation and the set of training levels in the generated training data.
We find that existing UED methods can significantly shift the training distribution, which translates to low ZSG performance.
To prevent both overfitting and distributional shift, we introduce data-regularised environment design (DRED).
arXiv Detail & Related papers (2024-02-05T19:47:45Z) - RGM: A Robust Generalizable Matching Model [49.60975442871967]
We propose a deep model for sparse and dense matching, termed RGM (Robust Generalist Matching).
To narrow the gap between synthetic training samples and real-world scenarios, we build a new, large-scale dataset with sparse correspondence ground truth.
We are able to mix up various dense and sparse matching datasets, significantly improving the training diversity.
arXiv Detail & Related papers (2023-10-18T07:30:08Z) - Consistency Regularization for Generalizable Source-free Domain
Adaptation [62.654883736925456]
Source-free domain adaptation (SFDA) aims to adapt a well-trained source model to an unlabelled target domain without accessing the source dataset.
Existing SFDA methods only assess their adapted models on the target training set, neglecting the data from unseen but identically distributed testing sets.
We propose a consistency regularization framework to develop a more generalizable SFDA method.
arXiv Detail & Related papers (2023-08-03T07:45:53Z) - Divide and Contrast: Source-free Domain Adaptation via Adaptive
Contrastive Learning [122.62311703151215]
Divide and Contrast (DaC) aims to connect the good ends of both worlds while bypassing their limitations.
DaC divides the target data into source-like and target-specific samples, where either group of samples is treated with tailored goals.
We further align the source-like domain with the target-specific samples using a memory bank-based Maximum Mean Discrepancy (MMD) loss to reduce the distribution mismatch.
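A Maximum Mean Discrepancy loss of the kind mentioned above can be sketched as a kernel two-sample statistic. This is a generic illustration, not the paper's memory-bank implementation; the Gaussian kernel and bandwidth are assumptions.

```python
import numpy as np

def gaussian_mmd2(X, Y, sigma=1.0):
    """Biased estimate of the squared Maximum Mean Discrepancy between
    samples X and Y under a Gaussian kernel with bandwidth sigma."""
    def kernel(A, B):
        # Pairwise squared distances via the expansion of ||a - b||^2.
        sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
        return np.exp(-sq / (2 * sigma**2))
    # MMD^2 = E[k(x, x')] + E[k(y, y')] - 2 E[k(x, y)]
    return kernel(X, X).mean() + kernel(Y, Y).mean() - 2 * kernel(X, Y).mean()
```

The statistic is near zero when the two samples come from the same distribution and grows as the distributions diverge, which is what makes it usable as an alignment loss.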
arXiv Detail & Related papers (2022-11-12T09:21:49Z) - Using Set Covering to Generate Databases for Holistic Steganalysis [2.089615335919449]
We explore a grid of processing pipelines to study the origins of Cover Source Mismatch (CSM).
A set-covering greedy algorithm is used to select representative pipelines minimizing the maximum regret between the representative and the pipelines within the set.
Our analysis also shows that parameters such as denoising, sharpening, and downsampling are very important to foster diversity.
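The set-covering selection described above can be sketched as a minimax-regret greedy loop. This is an illustrative sketch under assumptions, not the paper's actual algorithm or data: `regret[i][j]` is taken to be the performance loss when a detector trained on representative pipeline `i` is applied to pipeline `j`, and each step adds the pipeline that most reduces the worst-case regret.

```python
def select_representatives(regret, budget):
    """Greedily pick `budget` representative pipelines minimizing the
    maximum regret over all pipelines.

    regret[i][j]: regret of covering pipeline j with representative i.
    Returns the list of selected pipeline indices, in selection order.
    """
    n = len(regret)
    selected = []
    for _ in range(budget):
        def max_regret(candidate):
            reps = selected + [candidate]
            # Each pipeline is covered by its best available representative.
            return max(min(regret[i][j] for i in reps) for j in range(n))
        best = min((i for i in range(n) if i not in selected), key=max_regret)
        selected.append(best)
    return selected
```

With a regret matrix containing two similar pipelines and one very different one, the greedy loop picks one representative from each group, which matches the intent of covering a diverse grid with few training sources.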
arXiv Detail & Related papers (2022-11-07T10:53:02Z) - Symmetry-aware Neural Architecture for Embodied Visual Navigation [24.83118298491349]
Experimental results show that our method increases area coverage by $8.1\,m^2$ when trained on the Gibson dataset and tested on the MP3D dataset.
arXiv Detail & Related papers (2021-12-17T14:07:23Z) - Meta Learning Low Rank Covariance Factors for Energy-Based Deterministic
Uncertainty [58.144520501201995]
Bi-Lipschitz regularization of neural network layers preserves relative distances between data instances in the feature spaces of each layer.
With the use of an attentive set encoder, we propose to meta learn either diagonal or diagonal plus low-rank factors to efficiently construct task specific covariance matrices.
We also propose an inference procedure which utilizes scaled energy to achieve a final predictive distribution.
arXiv Detail & Related papers (2021-10-12T22:04:19Z) - Visual SLAM with Graph-Cut Optimized Multi-Plane Reconstruction [11.215334675788952]
This paper presents a semantic planar SLAM system that improves pose estimation and mapping using cues from an instance planar segmentation network.
While the mainstream approaches are using RGB-D sensors, employing a monocular camera with such a system still faces challenges such as robust data association and precise geometric model fitting.
arXiv Detail & Related papers (2021-08-09T18:16:08Z) - Measuring Generalization with Optimal Transport [111.29415509046886]
We develop margin-based generalization bounds, where the margins are normalized with optimal transport costs.
Our bounds robustly predict the generalization error, given training data and network parameters, on large scale datasets.
arXiv Detail & Related papers (2021-06-07T03:04:59Z) - Towards Uncovering the Intrinsic Data Structures for Unsupervised Domain
Adaptation using Structurally Regularized Deep Clustering [119.88565565454378]
Unsupervised domain adaptation (UDA) aims to learn classification models that make predictions for unlabeled data on a target domain.
We propose a hybrid model of Structurally Regularized Deep Clustering, which integrates the regularized discriminative clustering of target data with a generative one.
Our proposed H-SRDC outperforms all the existing methods under both the inductive and transductive settings.
arXiv Detail & Related papers (2020-12-08T08:52:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.