DyRA: Portable Dynamic Resolution Adjustment Network for Existing Detectors
- URL: http://arxiv.org/abs/2311.17098v3
- Date: Thu, 14 Mar 2024 13:22:51 GMT
- Title: DyRA: Portable Dynamic Resolution Adjustment Network for Existing Detectors
- Authors: Daeun Seo, Hoeseok Yang, Hyungshin Kim,
- Abstract summary: This paper introduces DyRA, a dynamic resolution adjustment network providing an image-specific scale factor for existing detectors.
Loss function is devised to minimize the accuracy drop across contrasting objectives of different-sized objects for scaling.
- Score: 0.669087470775851
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Achieving constant accuracy in object detection is challenging due to the inherent variability of object sizes. One effective approach to this problem involves optimizing input resolution, referred to as a multi-resolution strategy. Previous approaches to resolution optimization have often been based on pre-defined resolutions with manual selection. However, there is a lack of study on run-time resolution optimization for existing architectures. This paper introduces DyRA, a dynamic resolution adjustment network providing an image-specific scale factor for existing detectors. This network is co-trained with detectors utilizing specially designed loss functions, namely ParetoScaleLoss and BalanceLoss. ParetoScaleLoss determines an adaptive scale factor for robustness, while BalanceLoss optimizes overall scale factors according to the localization performance of the detector. The loss function is devised to minimize the accuracy drop across contrasting objectives of different-sized objects for scaling. Our proposed network can improve accuracy across various models, including RetinaNet, Faster-RCNN, FCOS, DINO, and H-Deformable-DETR. The code is available at https://github.com/DaEunFullGrace/DyRA.git.
Related papers
- RE-POSE: Synergizing Reinforcement Learning-Based Partitioning and Offloading for Edge Object Detection [3.2805151494259563]
Real-time object detection on edge devices presents significant challenges due to their limited computational resources and the high demands of deep neural network (DNN)-based detection models.
This paper introduces RE-POSE, a framework designed to optimize the accuracy-latency trade-off in resource-constrained edge environments.
arXiv Detail & Related papers (2025-01-16T10:56:45Z) - Elastic-DETR: Making Image Resolution Learnable with Content-Specific Network Prediction [0.612477318852572]
We introduce a novel strategy for learnable resolution, called Elastic-DETR, enabling elastic utilization of multiple image resolutions.
Our network provides an adaptive scale factor based on the content of the image with a compact scale prediction module.
By leveraging the resolution's flexibility, we can demonstrate various models that exhibit varying trade-offs between accuracy and computational complexity.
arXiv Detail & Related papers (2024-12-09T09:46:21Z) - Adaptive Resolution Residual Networks -- Generalizing Across Resolutions Easily and Efficiently [7.087237546722617]
We introduce Adaptive Resolution Residual Networks (ARRNs)
ARRNs inherit the advantages of adaptive-resolution methods and the ease of use of fixed-resolution methods.
We show that ARRNs embrace the challenge posed by diverse resolutions with greater flexibility, robustness, and computational efficiency.
arXiv Detail & Related papers (2024-12-09T04:25:37Z) - Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z) - Global Context Aggregation Network for Lightweight Saliency Detection of
Surface Defects [70.48554424894728]
We develop a Global Context Aggregation Network (GCANet) for lightweight saliency detection of surface defects on the encoder-decoder structure.
First, we introduce a novel transformer encoder on the top layer of the lightweight backbone, which captures global context information through a novel Depth-wise Self-Attention (DSA) module.
The experimental results on three public defect datasets demonstrate that the proposed network achieves a better trade-off between accuracy and running efficiency compared with other 17 state-of-the-art methods.
arXiv Detail & Related papers (2023-09-22T06:19:11Z) - Effective Invertible Arbitrary Image Rescaling [77.46732646918936]
Invertible Neural Networks (INN) are able to increase upscaling accuracy significantly by optimizing the downscaling and upscaling cycle jointly.
A simple and effective invertible arbitrary rescaling network (IARN) is proposed to achieve arbitrary image rescaling by training only one model in this work.
It is shown to achieve a state-of-the-art (SOTA) performance in bidirectional arbitrary rescaling without compromising perceptual quality in LR outputs.
arXiv Detail & Related papers (2022-09-26T22:22:30Z) - Exploring Resolution and Degradation Clues as Self-supervised Signal for
Low Quality Object Detection [77.3530907443279]
We propose a novel self-supervised framework to detect objects in degraded low resolution images.
Our methods has achieved superior performance compared with existing methods when facing variant degradation situations.
arXiv Detail & Related papers (2022-08-05T09:36:13Z) - Pyramid Grafting Network for One-Stage High Resolution Saliency
Detection [29.013012579688347]
We propose a one-stage framework called Pyramid Grafting Network (PGNet) to extract features from different resolution images independently.
An attention-based Cross-Model Grafting Module (CMGM) is proposed to enable CNN branch to combine broken detailed information more holistically.
We contribute a new Ultra-High-Resolution Saliency Detection dataset UHRSD, containing 5,920 images at 4K-8K resolutions.
arXiv Detail & Related papers (2022-04-11T12:22:21Z) - You Better Look Twice: a new perspective for designing accurate
detectors with reduced computations [56.34005280792013]
BLT-net is a new low-computation two-stage object detection architecture.
It reduces computations by separating objects from background using a very lite first-stage.
Resulting image proposals are then processed in the second-stage by a highly accurate model.
arXiv Detail & Related papers (2021-07-21T12:39:51Z) - Dynamic Resolution Network [40.64164953983429]
The redundancy on the input resolution of modern CNNs has not been fully investigated.
We propose a novel dynamic-resolution network (DRNet) in which the resolution is determined dynamically based on each input sample.
DRNet achieves similar performance with an about 34% reduction, while gains 1.4% accuracy increase with 10% reduction compared to the original ResNet-50 on ImageNet.
arXiv Detail & Related papers (2021-06-05T13:48:33Z) - Resolution Adaptive Networks for Efficient Inference [53.04907454606711]
We propose a novel Resolution Adaptive Network (RANet), which is inspired by the intuition that low-resolution representations are sufficient for classifying "easy" inputs.
In RANet, the input images are first routed to a lightweight sub-network that efficiently extracts low-resolution representations.
High-resolution paths in the network maintain the capability to recognize the "hard" samples.
arXiv Detail & Related papers (2020-03-16T16:54:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.