DyRA: Portable Dynamic Resolution Adjustment Network for Existing Detectors
- URL: http://arxiv.org/abs/2311.17098v3
- Date: Thu, 14 Mar 2024 13:22:51 GMT
- Title: DyRA: Portable Dynamic Resolution Adjustment Network for Existing Detectors
- Authors: Daeun Seo, Hoeseok Yang, Hyungshin Kim
- Abstract summary: This paper introduces DyRA, a dynamic resolution adjustment network providing an image-specific scale factor for existing detectors.
The loss function is devised to minimize the accuracy drop across contrasting objectives of different-sized objects for scaling.
- Score: 0.669087470775851
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Achieving constant accuracy in object detection is challenging due to the inherent variability of object sizes. One effective approach to this problem involves optimizing input resolution, referred to as a multi-resolution strategy. Previous approaches to resolution optimization have often been based on pre-defined resolutions with manual selection. However, there is a lack of study on run-time resolution optimization for existing architectures. This paper introduces DyRA, a dynamic resolution adjustment network providing an image-specific scale factor for existing detectors. This network is co-trained with detectors utilizing specially designed loss functions, namely ParetoScaleLoss and BalanceLoss. ParetoScaleLoss determines an adaptive scale factor for robustness, while BalanceLoss optimizes overall scale factors according to the localization performance of the detector. The loss function is devised to minimize the accuracy drop across contrasting objectives of different-sized objects for scaling. Our proposed network can improve accuracy across various models, including RetinaNet, Faster-RCNN, FCOS, DINO, and H-Deformable-DETR. The code is available at https://github.com/DaEunFullGrace/DyRA.git.
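The abstract describes a network that supplies an image-specific scale factor to an existing detector. As a rough illustration of how such a factor might be applied around a detector, the sketch below resizes the input dimensions by a given factor and maps predicted boxes back to the original image coordinates. This is not the authors' implementation: the function names, the clamping limits `min_side` and `max_side`, and the `(x1, y1, x2, y2)` box format are all assumptions made for the example, and the scale factor is passed in by hand rather than predicted by a network.

```python
def scaled_size(width, height, scale, min_side=32, max_side=2048):
    """Apply a per-image scale factor to an image size.

    Returns (new_width, new_height, effective_scale). The clamping limits
    are hypothetical safeguards, not values taken from the paper.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    # Clamp the factor so the shorter side stays >= min_side and the
    # longer side stays <= max_side.
    lowest = min_side / min(width, height)
    highest = max_side / max(width, height)
    scale = max(lowest, min(scale, highest))
    return round(width * scale), round(height * scale), scale


def boxes_to_original(boxes, scale):
    """Map (x1, y1, x2, y2) boxes predicted on the resized image back to
    the coordinate frame of the original image."""
    return [tuple(coord / scale for coord in box) for box in boxes]


# Example: a 640x480 image with a (hand-picked) scale factor of 1.5.
new_w, new_h, used_scale = scaled_size(640, 480, 1.5)
# Boxes a detector might return on the resized image (made-up values).
resized_boxes = [(96.0, 72.0, 480.0, 360.0)]
original_boxes = boxes_to_original(resized_boxes, used_scale)
```

In a real pipeline the resized image would be produced with an interpolation routine from an image library before being passed to the detector; only the size and coordinate bookkeeping is shown here.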
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- Scale-Invariant Object Detection by Adaptive Convolution with Unified Global-Local Context [3.061662434597098]
We propose an object detection model using a Switchable (adaptive) Atrous Convolutional Network (SAC-Net) based on the EfficientDet model.
The proposed SAC-Net encapsulates the benefits of both low-level and high-level features to achieve improved performance on multi-scale object detection tasks.
Our experiments on benchmark datasets demonstrate that the proposed SAC-Net outperforms the state-of-the-art models by a significant margin in terms of accuracy.
arXiv Detail & Related papers (2024-09-17T10:08:37Z)
- Depth Estimation using Weighted-loss and Transfer Learning [2.428301619698667]
We propose a simplified and adaptable approach to improve depth estimation accuracy using transfer learning and an optimized loss function.
The results indicate significant improvements in accuracy and robustness, with EfficientNet being the most successful architecture.
arXiv Detail & Related papers (2024-04-11T12:25:54Z)
- Global Context Aggregation Network for Lightweight Saliency Detection of Surface Defects [70.48554424894728]
We develop a Global Context Aggregation Network (GCANet) for lightweight saliency detection of surface defects on the encoder-decoder structure.
First, we introduce a novel transformer encoder on the top layer of the lightweight backbone, which captures global context information through a novel Depth-wise Self-Attention (DSA) module.
The experimental results on three public defect datasets demonstrate that the proposed network achieves a better trade-off between accuracy and running efficiency than 17 other state-of-the-art methods.
arXiv Detail & Related papers (2023-09-22T06:19:11Z)
- Effective Invertible Arbitrary Image Rescaling [77.46732646918936]
Invertible Neural Networks (INN) are able to increase upscaling accuracy significantly by optimizing the downscaling and upscaling cycle jointly.
A simple and effective invertible arbitrary rescaling network (IARN) is proposed to achieve arbitrary image rescaling by training only one model in this work.
It is shown to achieve a state-of-the-art (SOTA) performance in bidirectional arbitrary rescaling without compromising perceptual quality in LR outputs.
arXiv Detail & Related papers (2022-09-26T22:22:30Z)
- Exploring Resolution and Degradation Clues as Self-supervised Signal for Low Quality Object Detection [77.3530907443279]
We propose a novel self-supervised framework to detect objects in degraded low resolution images.
Our method achieves superior performance compared with existing methods under varying degradation conditions.
arXiv Detail & Related papers (2022-08-05T09:36:13Z)
- Pyramid Grafting Network for One-Stage High Resolution Saliency Detection [29.013012579688347]
We propose a one-stage framework called Pyramid Grafting Network (PGNet) to extract features from different resolution images independently.
An attention-based Cross-Model Grafting Module (CMGM) is proposed to enable the CNN branch to combine broken detailed information more holistically.
We contribute a new Ultra-High-Resolution Saliency Detection dataset UHRSD, containing 5,920 images at 4K-8K resolutions.
arXiv Detail & Related papers (2022-04-11T12:22:21Z)
- You Better Look Twice: a new perspective for designing accurate detectors with reduced computations [56.34005280792013]
BLT-net is a new low-computation two-stage object detection architecture.
It reduces computations by separating objects from background using a very lightweight first stage.
The resulting image proposals are then processed in the second stage by a highly accurate model.
arXiv Detail & Related papers (2021-07-21T12:39:51Z)
- Dynamic Resolution Network [40.64164953983429]
The redundancy on the input resolution of modern CNNs has not been fully investigated.
We propose a novel dynamic-resolution network (DRNet) in which the resolution is determined dynamically based on each input sample.
DRNet achieves similar performance with about a 34% computation reduction, and gains a 1.4% accuracy increase with a 10% computation reduction, compared to the original ResNet-50 on ImageNet.
arXiv Detail & Related papers (2021-06-05T13:48:33Z)
- Resolution Switchable Networks for Runtime Efficient Image Recognition [46.09537029831355]
We propose a general method to train a single convolutional neural network which is capable of switching image resolutions at inference.
Networks trained with the proposed method are named Resolution Switchable Networks (RS-Nets).
arXiv Detail & Related papers (2020-07-19T02:12:59Z)
- Resolution Adaptive Networks for Efficient Inference [53.04907454606711]
We propose a novel Resolution Adaptive Network (RANet), which is inspired by the intuition that low-resolution representations are sufficient for classifying "easy" inputs.
In RANet, the input images are first routed to a lightweight sub-network that efficiently extracts low-resolution representations.
High-resolution paths in the network maintain the capability to recognize the "hard" samples.
arXiv Detail & Related papers (2020-03-16T16:54:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.