A-PixelHop: A Green, Robust and Explainable Fake-Image Detector
- URL: http://arxiv.org/abs/2111.04012v1
- Date: Sun, 7 Nov 2021 06:31:26 GMT
- Title: A-PixelHop: A Green, Robust and Explainable Fake-Image Detector
- Authors: Yao Zhu, Xinyu Wang, Hong-Shuo Chen, Ronald Salloum, C.-C. Jay Kuo
- Abstract summary: A novel method for detecting CNN-generated images, called Attentive PixelHop (or A-PixelHop), is proposed in this work.
It has three advantages: 1) low computational complexity and a small model size, 2) high detection performance against a wide range of generative models, and 3) mathematical transparency.
- Score: 27.34087987867584
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A novel method for detecting CNN-generated images, called Attentive PixelHop
(or A-PixelHop), is proposed in this work. It has three advantages: 1) low
computational complexity and a small model size, 2) high detection performance
against a wide range of generative models, and 3) mathematical transparency.
A-PixelHop is designed under the assumption that it is difficult to synthesize
high-quality, high-frequency components in local regions. It contains four
building modules: 1) selecting edge/texture blocks that contain significant
high-frequency components, 2) applying multiple filter banks to them to obtain
rich sets of spatial-spectral responses as features, 3) feeding features to
multiple binary classifiers to obtain a set of soft decisions, 4) developing an
effective ensemble scheme to fuse the soft decisions into the final decision.
Experimental results show that A-PixelHop outperforms state-of-the-art methods
in detecting CycleGAN-generated images. Furthermore, it can generalize well to
unseen generative models and datasets.
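The four building modules above can be sketched in simplified form. Everything below is an illustrative assumption: the Laplacian-based block scoring stands in for the paper's edge/texture selection, and plain probability averaging stands in for its ensemble scheme; the actual PixelHop filter banks and classifiers are more elaborate.

```python
# Minimal sketch of the A-PixelHop pipeline: select high-frequency blocks
# (module 1), then fuse per-block soft decisions (module 4, simplified).
# Block size, top-k, and the Laplacian energy proxy are assumptions.

def block_energy(img, r0, c0, size):
    """Sum of absolute 4-neighbour Laplacian responses inside one block,
    a simple proxy for high-frequency (edge/texture) content."""
    e = 0.0
    for r in range(r0 + 1, r0 + size - 1):
        for c in range(c0 + 1, c0 + size - 1):
            lap = (4 * img[r][c] - img[r - 1][c] - img[r + 1][c]
                   - img[r][c - 1] - img[r][c + 1])
            e += abs(lap)
    return e

def select_blocks(img, size=8, top_k=4):
    """Module 1: keep the top-k blocks with the strongest
    high-frequency energy."""
    h, w = len(img), len(img[0])
    scored = []
    for r0 in range(0, h - size + 1, size):
        for c0 in range(0, w - size + 1, size):
            scored.append((block_energy(img, r0, c0, size), (r0, c0)))
    scored.sort(reverse=True)
    return [pos for _, pos in scored[:top_k]]

def fuse_soft_decisions(probs):
    """Module 4 (simplified): average the per-block 'fake' probabilities
    into one image-level decision score."""
    return sum(probs) / len(probs)
```

On a synthetic 16x16 image that is flat except for one textured (checkerboard) block, `select_blocks` picks out exactly that block, mirroring the assumption that the informative evidence lives in high-frequency local regions.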
Related papers
- PVAFN: Point-Voxel Attention Fusion Network with Multi-Pooling Enhancing for 3D Object Detection [59.355022416218624]
The integration of point and voxel representations is becoming more common in LiDAR-based 3D object detection.
We propose a novel two-stage 3D object detector, called the Point-Voxel Attention Fusion Network (PVAFN).
PVAFN uses a multi-pooling strategy to integrate both multi-scale and region-specific information effectively.
arXiv Detail & Related papers (2024-08-26T19:43:01Z)
- PFGS: High Fidelity Point Cloud Rendering via Feature Splatting [5.866747029417274]
We propose a novel framework to render high-quality images from sparse points.
This method is the first attempt to bridge 3D Gaussian Splatting and point cloud rendering.
Experiments on different benchmarks show the superiority of our method in rendering quality and the necessity of its main components.
arXiv Detail & Related papers (2024-07-04T11:42:54Z)
- Spatially Optimized Compact Deep Metric Learning Model for Similarity Search [1.0015171648915433]
Similarity search is a crucial task in which spatial features determine the output.
This study demonstrates that a single involution-based feature-extraction layer, used alongside a compact convolutional model, significantly enhances similarity-search performance.
arXiv Detail & Related papers (2024-04-09T19:49:01Z)
- Differentiable Registration of Images and LiDAR Point Clouds with VoxelPoint-to-Pixel Matching [58.10418136917358]
Cross-modality registration between 2D images from cameras and 3D point clouds from LiDARs is a crucial task in computer vision and robotics.
Previous methods estimate 2D-3D correspondences by matching point and pixel patterns learned by neural networks.
We learn a structured cross-modality matching solver to represent 3D features via a different latent pixel space.
arXiv Detail & Related papers (2023-12-07T05:46:10Z)
- Complementary Frequency-Varying Awareness Network for Open-Set Fine-Grained Image Recognition [14.450381668547259]
Open-set image recognition is a challenging topic in computer vision.
We propose a Complementary Frequency-varying Awareness Network (CFAN) that captures both high-frequency and low-frequency information more effectively.
Based on CFAN, we propose an open-set fine-grained image recognition method, called CFAN-OSFGR.
arXiv Detail & Related papers (2023-07-14T08:15:36Z)
- Green Steganalyzer: A Green Learning Approach to Image Steganalysis [30.486433532000344]
Green Steganalyzer (GS) is a learning solution to image steganalysis based on the green learning paradigm.
GS consists of three modules: 1) pixel-based anomaly prediction, 2) embedding location detection, and 3) decision fusion for image-level detection.
arXiv Detail & Related papers (2023-06-06T20:43:07Z)
- BIMS-PU: Bi-Directional and Multi-Scale Point Cloud Upsampling [60.257912103351394]
We develop a new point cloud upsampling pipeline called BIMS-PU.
We decompose the up/downsampling procedure into several up/downsampling sub-steps by breaking the target sampling factor into smaller factors.
We show that our method achieves superior results to state-of-the-art approaches.
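The factor decomposition described above can be illustrated with a small helper. The greedy base-2 splitting below is an assumed strategy for demonstration only, not the paper's actual decomposition scheme.

```python
def decompose_factor(r, base=2):
    """Split an upsampling factor r into a sequence of smaller sub-factors,
    so a single large up/downsampling step becomes several smaller ones.
    Strategy (an assumption): repeatedly divide by `base`, keeping any
    remaining non-divisible part as a final step."""
    steps = []
    while r > 1 and r % base == 0:
        steps.append(base)
        r //= base
    if r > 1:
        steps.append(r)
    return steps
```

For example, a 16x upsampling target would be carried out as four successive 2x sub-steps, and a 12x target as two 2x steps followed by a 3x step.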
arXiv Detail & Related papers (2022-06-25T13:13:37Z)
- VPFNet: Voxel-Pixel Fusion Network for Multi-class 3D Object Detection [5.12292602924464]
This paper proposes a fusion-based 3D object detection network, named the Voxel-Pixel Fusion Network (VPFNet).
The proposed method is evaluated on the KITTI benchmark for the multi-class 3D object detection task under multiple difficulty levels.
It is shown to outperform all state-of-the-art methods in mean average precision (mAP).
arXiv Detail & Related papers (2021-11-01T14:17:09Z)
- Global Filter Networks for Image Classification [90.81352483076323]
We present a conceptually simple yet computationally efficient architecture that learns long-term spatial dependencies in the frequency domain with log-linear complexity.
Our results demonstrate that GFNet can be a very competitive alternative to transformer-style models and CNNs in efficiency, generalization ability and robustness.
arXiv Detail & Related papers (2021-07-01T17:58:16Z)
- Generalizing Face Forgery Detection with High-frequency Features [63.33397573649408]
Current CNN-based detectors tend to overfit to method-specific color textures and thus fail to generalize.
We propose to utilize high-frequency noise for face forgery detection.
The first is the multi-scale high-frequency feature extraction module that extracts high-frequency noises at multiple scales.
The second is the residual-guided spatial attention module that guides the low-level RGB feature extractor to concentrate more on forgery traces from a new perspective.
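Multi-scale high-frequency extraction of the kind described above can be sketched on a 1-D signal: subtracting a low-pass version of the input leaves the high-frequency residual, and varying the filter size gives one residual per scale. The box-filter low-pass and the window sizes below are illustrative assumptions, not the paper's actual module.

```python
def moving_average(x, k):
    """Centered moving average (a simple box low-pass filter) with
    edge clamping at the signal boundaries."""
    n = len(x)
    half = k // 2
    out = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out

def highfreq_residuals(x, scales=(3, 5)):
    """One high-frequency residual (signal minus its low-pass version)
    per scale; larger windows leave lower frequencies in the residual."""
    feats = []
    for k in scales:
        blur = moving_average(x, k)
        feats.append([xi - bi for xi, bi in zip(x, blur)])
    return feats
```

A smooth (constant) signal yields near-zero residuals at every scale, while a sharp spike survives in all residuals, which is the behaviour a forgery detector exploits when synthesis artifacts concentrate in high frequencies.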
arXiv Detail & Related papers (2021-03-23T08:19:21Z)
- ITSELF: Iterative Saliency Estimation fLexible Framework [68.8204255655161]
Salient object detection estimates the objects that most stand out in an image.
We propose a superpixel-based ITerative Saliency Estimation fLexible Framework (ITSELF) that allows any user-defined assumptions to be added to the model.
We compare ITSELF to two state-of-the-art saliency estimators on five metrics and six datasets.
arXiv Detail & Related papers (2020-06-30T16:51:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences arising from its use.