SRF-GAN: Super-Resolved Feature GAN for Multi-Scale Representation
- URL: http://arxiv.org/abs/2011.08459v1
- Date: Tue, 17 Nov 2020 06:27:32 GMT
- Title: SRF-GAN: Super-Resolved Feature GAN for Multi-Scale Representation
- Authors: Seong-Ho Lee and Seung-Hwan Bae
- Abstract summary: We propose a novel generator for super-resolving features of convolutional object detectors.
In this paper, we first design a super-resolved feature GAN (SRF-GAN) consisting of a detection-based generator and a feature patch discriminator.
Our SRF generator can substitute for the traditional interpolation methods and can be easily fine-tuned in combination with other conventional detectors.
- Score: 5.634825161148483
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent convolutional object detectors exploit multi-scale feature
representations augmented with a top-down pathway in order to detect objects at
different scales and learn stronger semantic feature responses. In general,
during the top-down feature propagation, the coarser feature maps are upsampled
and combined with the features forwarded from the bottom-up pathway, and the
combined, semantically stronger features are fed to the detector's heads. However,
simple interpolation methods (e.g. nearest neighbor and bilinear) are still
used to increase feature resolution even though they produce noisy and blurred
features. In this paper, we propose a novel generator for super-resolving
the features of convolutional object detectors. To achieve this, we first
design a super-resolved feature GAN (SRF-GAN) consisting of a detection-based
generator and a feature patch discriminator. In addition, we present SRF-GAN
losses for generating high-quality super-resolved features while improving
detection accuracy. Our SRF generator can substitute for the traditional
interpolation methods and can be easily fine-tuned in combination with other
conventional detectors. To demonstrate this, we have implemented SRF-GAN with
several recent one-stage and two-stage detectors and improved detection
accuracy over those detectors. Code is available at
https://github.com/SHLee-cv/SRF-GAN.
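The top-down step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `upsample_nearest_2x` and `top_down_merge` are hypothetical, feature maps are reduced to single-channel nested lists, and the merge is assumed to be an element-wise sum as in FPN-style detectors. The `upsample` argument marks the point where a learned super-resolution generator could take the place of interpolation.

```python
from typing import Callable, List

# A single-channel H x W feature map, simplified for illustration.
FeatureMap = List[List[float]]


def upsample_nearest_2x(fmap: FeatureMap) -> FeatureMap:
    """Double height and width by repeating each value (nearest neighbor)."""
    out: FeatureMap = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))  # repeat each row twice
    return out


def top_down_merge(
    coarse: FeatureMap,
    lateral: FeatureMap,
    upsample: Callable[[FeatureMap], FeatureMap] = upsample_nearest_2x,
) -> FeatureMap:
    """Upsample the coarse map and add it to the bottom-up (lateral) map.

    Assumption: element-wise addition, following the FPN convention.
    `upsample` is the pluggable step that a learned generator could replace.
    """
    up = upsample(coarse)
    same_shape = len(up) == len(lateral) and all(
        len(a) == len(b) for a, b in zip(up, lateral)
    )
    if not same_shape:
        raise ValueError("upsampled map and lateral map must have equal shape")
    return [[u + l for u, l in zip(u_row, l_row)] for u_row, l_row in zip(up, lateral)]


coarse = [[1.0, 2.0], [3.0, 4.0]]        # 2 x 2 map from the coarser level
lateral = [[0.5] * 4 for _ in range(4)]  # 4 x 4 map from the bottom-up pathway
merged = top_down_merge(coarse, lateral)
```

Passing a different callable as `upsample` leaves the rest of the merge unchanged, which is the sense in which the abstract describes the generator as a substitute for interpolation.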
Related papers
- Renormalized Connection for Scale-preferred Object Detection in Satellite Imagery [51.83786195178233]
We design a Knowledge Discovery Network (KDN) to implement the renormalization group theory in terms of efficient feature extraction.
Renormalized connection (RC) on the KDN enables "synergistic focusing" of multi-scale features.
RCs extend the multi-level feature's "divide-and-conquer" mechanism of the FPN-based detectors to a wide range of scale-preferred tasks.
arXiv Detail & Related papers (2024-09-09T13:56:22Z)
- Efficient Meta-Learning Enabled Lightweight Multiscale Few-Shot Object Detection in Remote Sensing Images [15.12889076965307]
The YOLOv7 one-stage detector is subjected to a novel meta-learning training framework.
This transformation allows the detector to adeptly address few-shot object detection (FSOD) tasks while capitalizing on its inherent lightweight advantage.
To validate the effectiveness of our proposed detector, we conducted performance comparisons with current state-of-the-art detectors.
arXiv Detail & Related papers (2024-04-29T04:56:52Z)
- STMixer: A One-Stage Sparse Action Detector [43.62159663367588]
We propose two core designs for a more flexible one-stage action detector.
First, we present a query-based adaptive feature sampling module, which endows the detector with the flexibility of mining a group of features from the entire video-temporal domain.
Second, we devise a decoupled feature mixing module, which dynamically attends to and mixes features along the spatial and temporal dimensions respectively for better feature decoding.
arXiv Detail & Related papers (2024-04-15T14:52:02Z)
- Frequency-Aware Deepfake Detection: Improving Generalizability through Frequency Space Learning [81.98675881423131]
This research addresses the challenge of developing a universal deepfake detector that can effectively identify unseen deepfake images.
Existing frequency-based paradigms have relied on frequency-level artifacts introduced during the up-sampling in GAN pipelines to detect forgeries.
We introduce a novel frequency-aware approach called FreqNet, centered around frequency domain learning, specifically designed to enhance the generalizability of deepfake detectors.
arXiv Detail & Related papers (2024-03-12T01:28:00Z)
- Global Context Aggregation Network for Lightweight Saliency Detection of Surface Defects [70.48554424894728]
We develop a Global Context Aggregation Network (GCANet) for lightweight saliency detection of surface defects on the encoder-decoder structure.
First, we introduce a novel transformer encoder on the top layer of the lightweight backbone, which captures global context information through a novel Depth-wise Self-Attention (DSA) module.
The experimental results on three public defect datasets demonstrate that the proposed network achieves a better trade-off between accuracy and running efficiency compared with 17 other state-of-the-art methods.
arXiv Detail & Related papers (2023-09-22T06:19:11Z)
- Towards Efficient Use of Multi-Scale Features in Transformer-Based Object Detectors [49.83396285177385]
Multi-scale features have been proven highly effective for object detection but often come with huge and even prohibitive extra computation costs.
We propose Iterative Multi-scale Feature Aggregation (IMFA) -- a generic paradigm that enables efficient use of multi-scale features in Transformer-based object detectors.
arXiv Detail & Related papers (2022-08-24T08:09:25Z)
- Integral Migrating Pre-trained Transformer Encoder-decoders for Visual Object Detection [78.2325219839805]
imTED improves the state of the art in few-shot object detection by up to 7.6% AP.
Experiments on the MS COCO dataset demonstrate that imTED consistently outperforms its counterparts by 2.8%.
arXiv Detail & Related papers (2022-05-19T15:11:20Z)
- Generalizing Face Forgery Detection with High-frequency Features [63.33397573649408]
Current CNN-based detectors tend to overfit to method-specific color textures and thus fail to generalize.
We propose to utilize the high-frequency noises for face forgery detection.
The first is the multi-scale high-frequency feature extraction module that extracts high-frequency noises at multiple scales.
The second is the residual-guided spatial attention module that guides the low-level RGB feature extractor to concentrate more on forgery traces from a new perspective.
arXiv Detail & Related papers (2021-03-23T08:19:21Z)
- Small-Object Detection in Remote Sensing Images with End-to-End Edge-Enhanced GAN and Object Detector Network [9.135036713000513]
A generative adversarial network (GAN)-based model called enhanced super-resolution GAN (ESRGAN) shows remarkable image enhancement performance.
We propose a new edge-enhanced super-resolution GAN (EESRGAN) to improve the image quality of remote sensing images.
arXiv Detail & Related papers (2020-03-20T03:07:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.