Camera Model Identification with SPAIR-Swin and Entropy based Non-Homogeneous Patches
- URL: http://arxiv.org/abs/2503.22120v1
- Date: Fri, 28 Mar 2025 03:47:28 GMT
- Title: Camera Model Identification with SPAIR-Swin and Entropy based Non-Homogeneous Patches
- Authors: Protyay Dey, Rejoy Chakraborty, Abhilasha S. Jadhav, Kapil Rana, Gaurav Sharma, Puneet Goyal
- Abstract summary: Source camera model identification (SCMI) plays a pivotal role in image forensics with applications including authenticity verification and copyright protection. We propose SPAIR-Swin, a novel model combining a modified spatial attention mechanism and inverted residual block (SPAIR) with a Swin Transformer. SPAIR-Swin effectively captures both global and local features, enabling robust identification of artifacts such as noise patterns that are particularly effective for SCMI.
- Score: 8.348489591573966
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Source camera model identification (SCMI) plays a pivotal role in image forensics with applications including authenticity verification and copyright protection. For identifying the camera model used to capture a given image, we propose SPAIR-Swin, a novel model combining a modified spatial attention mechanism and inverted residual block (SPAIR) with a Swin Transformer. SPAIR-Swin effectively captures both global and local features, enabling robust identification of artifacts such as noise patterns that are particularly effective for SCMI. Additionally, unlike conventional methods focusing on homogeneous patches, we propose a patch selection strategy for SCMI that emphasizes high-entropy regions rich in patterns and textures. Extensive evaluations on four benchmark SCMI datasets demonstrate that SPAIR-Swin outperforms existing methods, achieving patch-level accuracies of 99.45%, 98.39%, 99.45%, and 97.46% and image-level accuracies of 99.87%, 99.32%, 100%, and 98.61% on the Dresden, Vision, Forchheim, and Socrates datasets, respectively. Our findings highlight that high-entropy patches, which contain high-frequency information such as edge sharpness, noise, and compression artifacts, are more effective at improving SCMI accuracy. Code will be made available upon request.
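The abstract describes selecting high-entropy patches (rather than homogeneous ones) and reports both patch-level and image-level accuracies. A minimal numpy sketch of what such a pipeline could look like is below; the patch size, the number of selected patches, and the majority-vote aggregation are illustrative assumptions, not details taken from the paper:

```python
import numpy as np

def patch_entropy(patch, bins=256):
    """Shannon entropy (in bits) of a grayscale patch's intensity histogram."""
    hist, _ = np.histogram(patch, bins=bins, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def select_high_entropy_patches(image, patch_size=64, k=4):
    """Tile the image into non-overlapping patches and keep the k patches
    with the highest entropy, i.e. the texture/edge-rich regions.
    patch_size and k are illustrative, not the paper's settings."""
    h, w = image.shape[:2]
    patches = []
    for y in range(0, h - patch_size + 1, patch_size):
        for x in range(0, w - patch_size + 1, patch_size):
            patches.append(image[y:y + patch_size, x:x + patch_size])
    scores = [patch_entropy(p) for p in patches]
    top = np.argsort(scores)[::-1][:k]
    return [patches[i] for i in top]

def image_level_prediction(patch_preds):
    """Aggregate patch-level class predictions into one image-level label.
    Majority voting is a common choice but is an assumption here; the
    abstract does not state the aggregation rule."""
    vals, counts = np.unique(np.asarray(patch_preds), return_counts=True)
    return vals[np.argmax(counts)]
```

A flat (homogeneous) patch has a single-bin histogram and therefore zero entropy, so this selection naturally skips the smooth regions that conventional homogeneous-patch methods favor.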
Related papers
- PRISM: Phase-enhanced Radial-based Image Signature Mapping framework for fingerprinting AI-generated images [2.119461028150219]
We introduce PRISM, a scalable framework for fingerprinting AI-generated images. We construct PRISM-36K, a novel dataset of 36,000 images generated by six text-to-image GAN- and diffusion-based models. PRISM achieves an attribution accuracy of 92.04% on this dataset.
arXiv Detail & Related papers (2025-09-18T10:57:26Z) - Minimal High-Resolution Patches Are Sufficient for Whole Slide Image Representation via Cascaded Dual-Scale Reconstruction [13.897013242536849]
Whole-slide image (WSI) analysis remains challenging due to gigapixel scale and sparsely distributed diagnostic regions. We propose a Cascaded Dual-Scale Reconstruction framework, demonstrating that only an average of 9 high-resolution patches per WSI are sufficient for robust slide-level representation.
arXiv Detail & Related papers (2025-08-03T08:01:30Z) - ECP-Mamba: An Efficient Multi-scale Self-supervised Contrastive Learning Method with State Space Model for PolSAR Image Classification [42.02105017671516]
This paper presents ECP-Mamba, an efficient framework integrating multi-scale self-supervised contrastive learning with a state space model (SSM) backbone. On the Flevoland 1989 dataset, ECP-Mamba achieves state-of-the-art performance with an overall accuracy of 99.70%, average accuracy of 99.64%, and Kappa coefficient of 0.9962.
arXiv Detail & Related papers (2025-06-01T14:52:54Z) - Identifying regions of interest in whole slide images of renal cell carcinoma [0.0]
This study developed an automated system to detect regions of interest (ROIs) in whole slide images of renal cell carcinoma (RCC)
The proposed approach is based on an efficient texture descriptor named dominant rotated local binary pattern (DRLBP) combined with a color transformation to reveal and exploit the rich texture variability visible at high microscopic magnifications.
The results show highly efficient image classification and demonstrate the approach's efficacy in identifying ROIs.
arXiv Detail & Related papers (2025-04-09T22:28:26Z) - End-to-End Implicit Neural Representations for Classification [57.55927378696826]
Implicit neural representations (INRs) encode a signal in neural network parameters and show excellent results for signal reconstruction. INR-based classification still significantly under-performs compared to pixel-based methods like CNNs. This work presents an end-to-end strategy for initializing SIRENs together with a learned learning-rate scheme.
arXiv Detail & Related papers (2025-03-23T16:02:23Z) - Hybrid Deepfake Image Detection: A Comprehensive Dataset-Driven Approach Integrating Convolutional and Attention Mechanisms with Frequency Domain Features [0.6700983301090583]
We propose an ensemble-based approach that employs three different neural network architectures for deepfake detection. We empirically demonstrate the effectiveness of these models in grouping real and fake images into cohesive clusters. Our weighted ensemble model achieves an accuracy of 93.23% on the validation dataset of the SP Cup 2025 competition.
arXiv Detail & Related papers (2025-02-15T06:02:11Z) - A Hybrid Framework for Statistical Feature Selection and Image-Based Noise-Defect Detection [55.2480439325792]
This paper presents a hybrid framework that integrates both statistical feature selection and classification techniques to improve defect detection accuracy. We present around 55 distinct features extracted from industrial images, which are then analyzed using statistical methods. By integrating these methods with flexible machine learning applications, the proposed framework improves detection accuracy and reduces false positives and misclassifications.
arXiv Detail & Related papers (2024-12-11T22:12:21Z) - Adaptive Residual Transformation for Enhanced Feature-Based OOD Detection in SAR Imagery [5.63530048112308]
The presence of unknown targets in real battlefield scenarios is unavoidable.
Various feature-based out-of-distribution approaches have been developed to address this issue.
We propose transforming feature-based OOD detection into a class-localized feature-residual-based approach.
arXiv Detail & Related papers (2024-11-01T00:09:02Z) - A Unified Model for Compressed Sensing MRI Across Undersampling Patterns [69.19631302047569]
Deep neural networks have shown great potential for reconstructing high-fidelity images from undersampled measurements. Our model is based on neural operators, a discretization-agnostic architecture. Our inference speed is also 1,400x faster than diffusion methods.
arXiv Detail & Related papers (2024-10-05T20:03:57Z) - Pixel-Inconsistency Modeling for Image Manipulation Localization [59.968362815126326]
Digital image forensics plays a crucial role in image authentication and manipulation localization.
This paper presents a generalized and robust manipulation localization model through the analysis of pixel inconsistency artifacts.
Experiments show that our method successfully extracts inherent pixel-inconsistency forgery fingerprints.
arXiv Detail & Related papers (2023-09-30T02:54:51Z) - Searching a Compact Architecture for Robust Multi-Exposure Image Fusion [55.37210629454589]
Two major stumbling blocks hinder the development of multi-exposure image fusion: pixel misalignment and inefficient inference.
This study introduces an architecture search-based paradigm incorporating self-alignment and detail repletion modules for robust multi-exposure image fusion.
The proposed method outperforms various competitive schemes, achieving a noteworthy 3.19% improvement in PSNR for general scenarios and an impressive 23.5% enhancement in misaligned scenarios.
arXiv Detail & Related papers (2023-05-20T17:01:52Z) - Cascaded Cross-Attention Networks for Data-Efficient Whole-Slide Image Classification Using Transformers [0.11219061154635457]
Whole-Slide Imaging allows for the capturing and digitization of high-resolution images of histological specimens.
The transformer architecture has been proposed as a possible candidate for effectively leveraging the high-resolution information.
We propose a novel cascaded cross-attention network (CCAN) based on the cross-attention mechanism that scales linearly with the number of extracted patches.
arXiv Detail & Related papers (2023-05-11T16:42:24Z) - Enhanced Sharp-GAN For Histopathology Image Synthesis [63.845552349914186]
Histopathology image synthesis aims to address the data shortage issue in training deep learning approaches for accurate cancer detection.
We propose a novel approach that enhances the quality of synthetic images by using nuclei topology and contour regularization.
The proposed approach outperforms Sharp-GAN in all four image quality metrics on two datasets.
arXiv Detail & Related papers (2023-01-24T17:54:01Z) - Lightweight 3D Convolutional Neural Network for Schizophrenia Diagnosis Using MRI Images and Ensemble Bagging Classifier [1.487444917213389]
This paper proposes a lightweight 3D convolutional neural network (CNN) based framework for schizophrenia diagnosis using MRI images.
Compared to current state-of-the-art techniques, the model achieves the highest accuracy of 92.22%, sensitivity of 94.44%, specificity of 90%, precision of 90.43%, recall of 94.44%, F1-score of 92.39%, and G-mean of 92.19%.
arXiv Detail & Related papers (2022-11-05T10:27:37Z) - SAR Despeckling using a Denoising Diffusion Probabilistic Model [52.25981472415249]
The presence of speckle degrades the image quality and adversely affects the performance of SAR image understanding applications.
We introduce SAR-DDPM, a denoising diffusion probabilistic model for SAR despeckling.
The proposed method achieves significant improvements in both quantitative and qualitative results over the state-of-the-art despeckling methods.
arXiv Detail & Related papers (2022-06-09T14:00:26Z) - Coarse-to-Fine Sparse Transformer for Hyperspectral Image Reconstruction [138.04956118993934]
We propose a novel Transformer-based method, the coarse-to-fine sparse Transformer (CST), which embeds HSI sparsity into deep learning for HSI reconstruction.
In particular, CST uses our proposed spectra-aware screening mechanism (SASM) for coarse patch selection; the selected patches are then fed into our customized spectra-aggregation hashing multi-head self-attention (SAH-MSA) for fine pixel clustering and self-similarity capturing.
arXiv Detail & Related papers (2022-03-09T16:17:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.