A Two-Stage Dual-Path Framework for Text Tampering Detection and
Recognition
- URL: http://arxiv.org/abs/2402.13545v2
- Date: Thu, 22 Feb 2024 02:12:19 GMT
- Title: A Two-Stage Dual-Path Framework for Text Tampering Detection and
Recognition
- Authors: Guandong Li, Xian Yang, Wenpin Ma
- Abstract summary: Before the advent of deep learning, document tamper detection was difficult.
We have made some explorations in the field of text tamper detection based on deep learning.
Our Ps tamper detection method includes three steps: feature assistance, audit point positioning, and tamper recognition.
- Score: 12.639006068141528
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Document tamper detection has always been an important aspect of tamper
detection. Before the advent of deep learning, document tamper detection was
difficult. We have made some explorations in the field of text tamper detection
based on deep learning. Our Ps tamper detection method includes three steps:
feature assistance, audit point positioning, and tamper recognition. It
involves hierarchical filtering and graded output (tampered/suspected
tampered/untampered). By combining artificial tamper data features, we simulate
and augment data samples in various scenarios (cropping with noise
addition/replacement, single character/space replacement, smearing/splicing,
brightness/contrast adjustment, etc.). The auxiliary features include
EXIF/binary stream keyword retrieval/noise, which are used for branch detection
based on the results. Audit point positioning uses detection frameworks and
controls thresholds for high and low density detection. Tamper recognition
employs a dual-path dual-stream recognition network, with RGB and ELA stream
feature extraction. After dimensionality reduction through self-correlation
percentile pooling, the fused output is processed through VLAD, yielding an
accuracy of 0.804, recall of 0.659, and precision of 0.913.
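The graded output mentioned in the abstract (tampered/suspected tampered/untampered) can be illustrated with a small sketch. The abstract does not say how the grades are derived, so everything here is an assumption for illustration: the names `Verdict`, `grade`, and `grade_document`, the idea of a per-region tamper score in [0, 1], the `low` and `high` thresholds, and the rule that a document takes the most severe grade among its regions.

```python
from enum import Enum


class Verdict(Enum):
    """The three grades named in the abstract."""
    UNTAMPERED = "untampered"
    SUSPECTED = "suspected tampered"
    TAMPERED = "tampered"


# Ordering from least to most severe, used to aggregate region verdicts.
_SEVERITY = [Verdict.UNTAMPERED, Verdict.SUSPECTED, Verdict.TAMPERED]


def grade(score: float, low: float = 0.3, high: float = 0.7) -> Verdict:
    """Map one tamper score in [0, 1] to a grade.

    The thresholds are hypothetical placeholders, not values from the paper.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    if not low < high:
        raise ValueError("low must be strictly below high")
    if score >= high:
        return Verdict.TAMPERED
    if score >= low:
        return Verdict.SUSPECTED
    return Verdict.UNTAMPERED


def grade_document(region_scores, low: float = 0.3, high: float = 0.7) -> Verdict:
    """Grade every region and return the most severe verdict (assumed rule).

    A document with no candidate regions is reported as untampered.
    """
    verdicts = [grade(s, low, high) for s in region_scores]
    return max(verdicts, key=_SEVERITY.index, default=Verdict.UNTAMPERED)


if __name__ == "__main__":
    print(grade_document([0.12, 0.45, 0.08]).value)
```

Using two thresholds rather than one leaves a middle band for cases that merit manual review, which is one plausible reading of the "suspected tampered" grade.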
Related papers
- C2P-CLIP: Injecting Category Common Prompt in CLIP to Enhance Generalization in Deepfake Detection [98.34703790782254]
We introduce Category Common Prompt CLIP, which integrates the category common prompt into the text encoder to inject category-related concepts into the image encoder.
Our method achieves a 12.41% improvement in detection accuracy compared to the original CLIP, without introducing additional parameters during testing.
arXiv Detail & Related papers (2024-08-19T02:14:25Z)
- Simplifying Two-Stage Detectors for On-Device Inference in Remote Sensing [0.7305342793164903]
We propose a model simplification method for two-stage object detectors.
Our method reduces computation costs by up to 61.2% with an accuracy loss within 2.1% on the DOTAv1.5 dataset.
arXiv Detail & Related papers (2024-04-11T00:45:10Z)
- Bridging the Gap Between End-to-End and Two-Step Text Spotting [88.14552991115207]
Bridging Text Spotting is a novel approach that resolves the error accumulation and suboptimal performance issues in two-step methods.
We demonstrate the effectiveness of the proposed method through extensive experiments.
arXiv Detail & Related papers (2024-04-06T13:14:04Z)
- TAMPAR: Visual Tampering Detection for Parcel Logistics in Postal Supply Chains [45.62331048595689]
In this work, we focus on the use case of last-mile delivery, where only a single RGB image is taken and compared.
We propose a tampering detection pipeline that utilizes keypoint detection to identify the eight corner points of a parcel.
Experiments with multiple classical and deep learning-based change detection approaches are performed.
arXiv Detail & Related papers (2023-11-06T14:19:05Z)
- A Low-cost Strategic Monitoring Approach for Scalable and Interpretable Error Detection in Deep Neural Networks [6.537257913467249]
We present a highly compact run-time monitoring approach for deep computer vision networks.
It can efficiently detect silent data corruption originating from both hardware memory and input faults.
arXiv Detail & Related papers (2023-10-31T10:45:55Z)
- Semi-Supervised and Long-Tailed Object Detection with CascadeMatch [91.86787064083012]
We propose a novel pseudo-labeling-based detector called CascadeMatch.
Our detector features a cascade network architecture, which has multi-stage detection heads with progressive confidence thresholds.
We show that CascadeMatch surpasses existing state-of-the-art semi-supervised approaches in handling long-tailed object detection.
arXiv Detail & Related papers (2023-05-24T07:09:25Z)
- Deep Spectro-temporal Artifacts for Detecting Synthesized Speech [57.42110898920759]
This paper provides an overall assessment of track 1 (Low-quality Fake Audio Detection) and track 2 (Partially Fake Audio Detection).
In this paper, spectro-temporal artifacts were detected using raw temporal signals, spectral features, as well as deep embedding features.
We ranked 4th and 5th in track 1 and track 2, respectively.
arXiv Detail & Related papers (2022-10-11T08:31:30Z)
- Generalizing Face Forgery Detection with High-frequency Features [63.33397573649408]
Current CNN-based detectors tend to overfit to method-specific color textures and thus fail to generalize.
We propose to utilize the high-frequency noises for face forgery detection.
The first is the multi-scale high-frequency feature extraction module that extracts high-frequency noises at multiple scales.
The second is the residual-guided spatial attention module that guides the low-level RGB feature extractor to concentrate more on forgery traces from a new perspective.
arXiv Detail & Related papers (2021-03-23T08:19:21Z)
- Dense Label Encoding for Boundary Discontinuity Free Rotation Detection [69.75559390700887]
This paper explores a relatively less-studied methodology based on classification.
We propose new techniques to push its frontier in two aspects.
Experiments and visual analysis on large-scale public datasets for aerial images show the effectiveness of our approach.
arXiv Detail & Related papers (2020-11-19T05:42:02Z)
- Single-stage intake gesture detection using CTC loss and extended prefix beam search [8.22379888383833]
Accurate detection of individual intake gestures is a key step towards automatic dietary monitoring.
We propose a single-stage approach which directly decodes the probabilities learned from sensor data into sparse intake detections.
arXiv Detail & Related papers (2020-08-07T06:04:25Z)
- Sequential Drift Detection in Deep Learning Classifiers [4.022057598291766]
We utilize neural network embeddings to detect data drift by formulating the drift detection within an appropriate sequential decision framework.
We introduce a loss function which evaluates an algorithm's ability to balance these two concerns.
arXiv Detail & Related papers (2020-07-31T14:46:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.