Condensing Two-stage Detection with Automatic Object Key Part Discovery
- URL: http://arxiv.org/abs/2006.05597v3
- Date: Thu, 13 Aug 2020 01:08:50 GMT
- Title: Condensing Two-stage Detection with Automatic Object Key Part Discovery
- Authors: Zhe Chen, Jing Zhang, Dacheng Tao
- Abstract summary: Two-stage object detectors generally require excessively large models for their detection heads to achieve high accuracy.
We propose that the model parameters of two-stage detection heads can be condensed and reduced by concentrating on object key parts.
Our proposed technique consistently maintains original performance while waiving around 50% of the model parameters of common two-stage detection heads.
- Score: 87.1034745775229
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern two-stage object detectors generally require excessively large models
for their detection heads to achieve high accuracy. To address this problem, we
propose that the model parameters of two-stage detection heads can be condensed
and reduced by concentrating on object key parts. To this end, we first
introduce an automatic object key part discovery task to make neural networks
discover representative sub-parts in each foreground object. With these
discovered key parts, we then decompose the object appearance modeling into a
key part modeling process and a global modeling process for detection. Key part
modeling encodes fine and detailed features from the discovered key parts, and
global modeling encodes rough and holistic object characteristics. In practice,
such decomposition allows us to significantly abridge model parameters without
sacrificing much detection accuracy. Experiments on popular datasets illustrate
that our proposed technique consistently maintains original performance while
waiving around 50% of the model parameters of common two-stage detection heads,
with the performance only deteriorating by 1.5% when waiving around 96% of the
original model parameters. Code is released at:
https://github.com/zhechen/Condensing2stageDetection.
Related papers
- Open-Set Deepfake Detection: A Parameter-Efficient Adaptation Method with Forgery Style Mixture [58.60915132222421]
We introduce an approach that is both general and parameter-efficient for face forgery detection.
We design a forgery-style mixture formulation that augments the diversity of forgery source domains.
We show that the designed model achieves state-of-the-art generalizability with significantly reduced trainable parameters.
arXiv Detail & Related papers (2024-08-23T01:53:36Z)
- Innovative Horizons in Aerial Imagery: LSKNet Meets DiffusionDet for Advanced Object Detection [55.2480439325792]
We present an in-depth evaluation of an object detection model that integrates the LSKNet backbone with the DiffusionDet head.
The proposed model achieves a mean average precision (mAP) of approximately 45.7%, which is a significant improvement.
This advancement underscores the effectiveness of the proposed modifications and sets a new benchmark in aerial image analysis.
arXiv Detail & Related papers (2023-11-21T19:49:13Z)
- Détection d'Objets dans les documents numérisés par réseaux de neurones profonds (Object Detection in Digitized Documents Using Deep Neural Networks) [0.0]
We study multiple tasks related to document layout analysis such as the detection of text lines, the splitting into acts or the detection of the writing support.
We propose two deep neural models following two different approaches.
arXiv Detail & Related papers (2023-01-27T14:45:45Z)
- Rethinking the Detection Head Configuration for Traffic Object Detection [11.526701794026641]
We propose a lightweight traffic object detection network based on matching between detection head and object distribution.
The proposed model achieves more competitive performance than other models on the BDD100K dataset and our proposed ETFOD-v2 dataset.
arXiv Detail & Related papers (2022-10-08T02:23:57Z)
- When Liebig's Barrel Meets Facial Landmark Detection: A Practical Model [87.25037167380522]
We propose a model that is accurate, robust, efficient, generalizable, and end-to-end trainable.
In order to achieve a better accuracy, we propose two lightweight modules.
DQInit dynamically initializes the decoder queries from the inputs, enabling the model to match the accuracy of models with multiple decoder layers.
QAMem is designed to enhance the discriminative ability of queries on low-resolution feature maps by assigning separate memory values to each query rather than a shared one.
arXiv Detail & Related papers (2021-05-27T13:51:42Z)
- An Improvement of Object Detection Performance using Multi-step Machine Learnings [0.0]
This paper describes an enhancement of object detection based on a multi-step concept, where a post-processing step called the calibration model is introduced.
The calibration model consists of a convolutional neural network, and utilizes rich contextual information based on the domain knowledge of the input.
arXiv Detail & Related papers (2021-01-19T11:32:27Z)
- Slender Object Detection: Diagnoses and Improvements [74.40792217534]
In this paper, we are concerned with the detection of a particular type of objects with extreme aspect ratios, namely slender objects.
For a classical object detection method, a drastic drop of 18.9% mAP on COCO is observed if it is evaluated solely on slender objects.
arXiv Detail & Related papers (2020-11-17T09:39:42Z)
- Attention-based Joint Detection of Object and Semantic Part [4.389917490809522]
Our model is created on top of two Faster-RCNN models that share their features to get enhanced representations of both.
Experiments on the PASCAL-Part 2010 dataset show that joint detection can simultaneously improve both object detection and part detection.
arXiv Detail & Related papers (2020-07-05T18:54:10Z)
- End-to-End Object Detection with Transformers [88.06357745922716]
We present a new method that views object detection as a direct set prediction problem.
Our approach streamlines the detection pipeline, effectively removing the need for many hand-designed components.
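The main ingredients of the new framework, called DEtection TRansformer or DETR, are a set-based global loss that forces unique predictions via bipartite matching, and a transformer encoder-decoder architecture.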
arXiv Detail & Related papers (2020-05-26T17:06:38Z)
- Distillation of neural network models for detection and description of key points of images [0.0]
The aim of this study is to obtain a more compact model of detection and description of key points.
A new dataset for testing key point detection methods has been introduced, along with a new quality indicator for the allocated key points.
A new model with a significantly smaller number of parameters shows the accuracy of point matching close to the accuracy of the original model.
arXiv Detail & Related papers (2020-05-18T18:59:35Z)