High-resolution power equipment recognition based on improved
self-attention
- URL: http://arxiv.org/abs/2311.03518v2
- Date: Thu, 7 Dec 2023 00:45:21 GMT
- Title: High-resolution power equipment recognition based on improved
self-attention
- Authors: Siyi Zhang, Cheng Liu, Xiang Li, Xin Zhai, Zhen Wei, Sizhe Li, Xun Ma
- Abstract summary: This paper introduces a novel improvement on deep self-attention networks, tailored to recognizing power equipment in high-resolution images.
The proposed model comprises four key components: a foundational network, a region proposal network, a module for extracting and segmenting target areas, and a final prediction network.
The deep self-attention network's prediction mechanism uniquely incorporates the semantic context of images, resulting in substantially improved recognition performance.
- Score: 11.24310344443672
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The current trend of automating inspections at substations has sparked a
surge in interest in the field of transformer image recognition. However, due
to restrictions in the number of parameters in existing models, high-resolution
images can't be directly applied, leaving significant room for enhancing
recognition accuracy. Addressing this challenge, the paper introduces a novel
improvement on deep self-attention networks tailored for this issue. The
proposed model comprises four key components: a foundational network, a region
proposal network, a module for extracting and segmenting target areas, and a
final prediction network. The innovative approach of this paper differentiates
itself by decoupling the processes of part localization and recognition,
initially using low-resolution images for localization followed by
high-resolution images for recognition. Moreover, the deep self-attention
network's prediction mechanism uniquely incorporates the semantic context of
images, resulting in substantially improved recognition performance.
Comparative experiments validate that this method outperforms the two other
prevalent target recognition models, offering a groundbreaking perspective for
automating electrical equipment inspections.
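The decoupling described in the abstract (localize on a low-resolution image, then recognize on the high-resolution pixels) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the strided downsampling, the scale factor, and the example box are all hypothetical, and the region proposal and prediction networks are left out entirely. The sketch only shows how a box found at low resolution maps back to a high-resolution crop.

```python
import numpy as np

def downsample(image, factor):
    """Naive strided downsampling; a stand-in for a proper resize (assumption)."""
    return image[::factor, ::factor]

def scale_box(box, factor):
    """Map an (x0, y0, x1, y1) box from low-res to high-res pixel coordinates."""
    x0, y0, x1, y1 = box
    return (x0 * factor, y0 * factor, x1 * factor, y1 * factor)

def crop_high_res(image, box_low, factor):
    """Crop the high-res region that corresponds to a low-res proposal box."""
    x0, y0, x1, y1 = scale_box(box_low, factor)
    h, w = image.shape[:2]
    # Clamp to the image bounds so the slice is always valid.
    x0, y0 = max(0, x0), max(0, y0)
    x1, y1 = min(w, x1), min(h, y1)
    return image[y0:y1, x0:x1]

# Hypothetical 64x96 "high-resolution" image and a 4x downsampled copy.
high = np.arange(64 * 96).reshape(64, 96)
low = downsample(high, 4)

# A localization stage would produce this box from `low`; here it is made up.
box_low = (2, 3, 10, 9)

# The recognition stage would then receive the full-resolution patch.
patch = crop_high_res(high, box_low, 4)
```

In this sketch the localization stage only ever sees `low`, while the patch passed on for recognition keeps the original pixel density.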
Related papers
- Overhead Line Defect Recognition Based on Unsupervised Semantic Segmentation [8.672676348736834]
Overhead line inspection greatly benefits from defect recognition using visible light imagery.
This paper introduces a novel defect recognition framework built on the Faster RCNN network.
arXiv Detail & Related papers (2023-11-02T03:52:59Z)
- Affine-Consistent Transformer for Multi-Class Cell Nuclei Detection [76.11864242047074]
We propose a novel Affine-Consistent Transformer (AC-Former), which directly yields a sequence of nucleus positions.
We introduce an Adaptive Affine Transformer (AAT) module, which can automatically learn the key spatial transformations to warp original images for local network training.
Experimental results demonstrate that the proposed method significantly outperforms existing state-of-the-art algorithms on various benchmarks.
arXiv Detail & Related papers (2023-10-22T02:27:02Z)
- Pixel-Inconsistency Modeling for Image Manipulation Localization [59.968362815126326]
Digital image forensics plays a crucial role in image authentication and manipulation localization.
This paper presents a generalized and robust manipulation localization model through the analysis of pixel inconsistency artifacts.
Experiments show that our method successfully extracts inherent pixel-inconsistency forgery fingerprints.
arXiv Detail & Related papers (2023-09-30T02:54:51Z)
- An Enhanced Low-Resolution Image Recognition Method for Traffic Environments [3.018656336329545]
Low-resolution images suffer from small size, low quality, and lack of detail, leading to a decrease in the accuracy of traditional neural network recognition algorithms.
This paper introduces a dual-branch residual network structure that leverages the basic architecture of residual networks and a common feature subspace algorithm.
It incorporates the utilization of intermediate-layer features to enhance the accuracy of low-resolution image recognition.
arXiv Detail & Related papers (2023-09-28T12:38:31Z)
- Cross-resolution Face Recognition via Identity-Preserving Network and Knowledge Distillation [12.090322373964124]
Cross-resolution face recognition is a challenging problem for modern deep face recognition systems.
This paper proposes a new approach that enforces the network to focus on the discriminative information stored in the low-frequency components of a low-resolution image.
arXiv Detail & Related papers (2023-03-15T14:52:46Z)
- ASSET: Autoregressive Semantic Scene Editing with Transformers at High Resolutions [28.956280590967808]
Our architecture is based on a transformer with a novel attention mechanism.
Our key idea is to sparsify the transformer's attention matrix at high resolutions, guided by dense attention extracted at lower image resolutions.
We present qualitative and quantitative results, along with user studies, demonstrating the effectiveness of our method.
arXiv Detail & Related papers (2022-05-24T17:39:53Z)
- Transferable Class-Modelling for Decentralized Source Attribution of GAN-Generated Images [4.1483423188102755]
We redefine the deepfake detection and source attribution problems as a series of related binary classification tasks.
We leverage transfer learning to rapidly adapt forgery detection networks for multiple independent attribution problems.
Our models are determined via experimentation to be competitive with current benchmarks.
arXiv Detail & Related papers (2022-03-18T07:43:03Z)
- Detect and Locate: A Face Anti-Manipulation Approach with Semantic and Noise-level Supervision [67.73180660609844]
We propose a conceptually simple but effective method to efficiently detect forged faces in an image.
The proposed scheme relies on a segmentation map that delivers meaningful high-level semantic information clues about the image.
The proposed model achieves state-of-the-art detection accuracy and remarkable localization performance.
arXiv Detail & Related papers (2021-07-13T02:59:31Z)
- Bayesian Attention Belief Networks [59.183311769616466]
Attention-based neural networks have achieved state-of-the-art results on a wide range of tasks.
This paper introduces Bayesian attention belief networks, which construct a decoder network by modeling unnormalized attention weights.
We show that our method outperforms deterministic attention and state-of-the-art attention in accuracy, uncertainty estimation, generalization across domains, and adversarial attacks.
arXiv Detail & Related papers (2021-06-09T17:46:22Z)
- Revisiting The Evaluation of Class Activation Mapping for Explainability: A Novel Metric and Experimental Analysis [54.94682858474711]
Class Activation Mapping (CAM) approaches provide an effective visualization by taking weighted averages of the activation maps.
We propose a novel set of metrics to quantify explanation maps, which show better effectiveness and simplify comparisons between approaches.
arXiv Detail & Related papers (2021-04-20T21:34:24Z)
- Unpaired Image Enhancement with Quality-Attention Generative Adversarial Network [92.01145655155374]
We propose a quality attention generative adversarial network (QAGAN) trained on unpaired data.
The key novelty of the proposed QAGAN lies in the QAM injected into the generator.
Our proposed method achieves better performance in both objective and subjective evaluations.
arXiv Detail & Related papers (2020-12-30T05:57:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.