A Cross-direction Task Decoupling Network for Small Logo Detection
- URL: http://arxiv.org/abs/2305.02503v1
- Date: Thu, 4 May 2023 02:23:34 GMT
- Title: A Cross-direction Task Decoupling Network for Small Logo Detection
- Authors: Hou, Sujuan and Li, Xingzhuo and Min, Weiqing and Li, Jiacheng and
Wang, Jing and Zheng, Yuanjie and Jiang, Shuqiang
- Abstract summary: We propose the Cross-direction Task Decoupling Network (CTDNet) for small logo detection.
Comprehensive experiments on four logo datasets demonstrate the effectiveness and efficiency of the proposed method.
- Score: 28.505952002735334
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Logo detection plays an integral role in many applications. However, handling
small logos is still difficult since they occupy too few pixels in the image,
which burdens the extraction of discriminative features. The aggregation of
small logos also brings a great challenge to the classification and
localization of logos. To solve these problems, we propose the
Cross-direction Task Decoupling Network (CTDNet) for small logo detection. We
first introduce a Cross-direction Feature Pyramid (CFP) that realizes
cross-direction feature fusion through horizontal transmission and vertical
transmission. In addition, a Multi-frequency Task Decoupling Head (MTDH)
decouples the classification and localization tasks into two branches. A
multi-frequency attention convolution branch is designed to achieve more accurate
regression by combining the discrete cosine transform with convolution.
Comprehensive experiments on four logo datasets demonstrate the effectiveness
and efficiency of the proposed method.
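The abstract only states that the regression branch combines the discrete cosine transform with convolution; it does not give the layer design. As a rough illustration of what DCT-based multi-frequency channel attention can look like, here is a minimal NumPy sketch. It assumes a pooling scheme in which each channel group is summarized by one 2D DCT basis function and then gated with a sigmoid. The function names `dct_basis` and `multi_frequency_attention`, the choice of frequencies, and the absence of learned layers are all assumptions for illustration, not the authors' MTDH implementation.

```python
import numpy as np


def dct_basis(u, v, height, width):
    """Return the 2D DCT-II basis function for frequency index (u, v)."""
    ys = np.cos(np.pi * (np.arange(height) + 0.5) * u / height)
    xs = np.cos(np.pi * (np.arange(width) + 0.5) * v / width)
    return np.outer(ys, xs)


def multi_frequency_attention(features, freqs=((0, 0), (0, 1), (1, 0), (1, 1))):
    """Reweight channels of a (C, H, W) feature map using DCT-pooled descriptors.

    Hypothetical sketch: channels are split into len(freqs) groups and each
    group is pooled with one DCT basis function. Frequency (0, 0) reduces to
    ordinary global average pooling.
    """
    channels, height, width = features.shape
    groups = np.array_split(np.arange(channels), len(freqs))
    descriptor = np.empty(channels)
    for idx, (u, v) in zip(groups, freqs):
        basis = dct_basis(u, v, height, width)
        descriptor[idx] = (features[idx] * basis).sum(axis=(1, 2)) / (height * width)
    # Sigmoid gate; a trained module would normally pass the descriptor
    # through learned layers first (omitted here as an assumption).
    weights = 1.0 / (1.0 + np.exp(-descriptor))
    return features * weights[:, None, None], weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fmap = rng.normal(size=(8, 16, 16))
    reweighted, gate = multi_frequency_attention(fmap)
    print(reweighted.shape, gate.round(3))
```

In a decoupled head of the kind the abstract describes, a gate like this would sit only on the localization branch, with the classification branch kept separate; how CTDNet wires the two branches is not specified in the abstract.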
Related papers
- LogoSticker: Inserting Logos into Diffusion Models for Customized Generation [73.59571559978278]
We introduce the task of logo insertion into text-to-image models.
Our goal is to insert logo identities into diffusion models and enable their seamless synthesis in varied contexts.
We present a novel two-phase pipeline LogoSticker to tackle this task.
arXiv Detail & Related papers (2024-07-18T17:54:49Z)
- Few-Shot Object Detection with Fully Cross-Transformer [35.49840687007507]
Few-shot object detection (FSOD) aims to detect novel objects using very few training examples.
We propose a novel Fully Cross-Transformer based model (FCT) for FSOD by incorporating cross-transformer into both the feature backbone and detection head.
Our model can improve the few-shot similarity learning between the two branches by introducing the multi-level interactions.
arXiv Detail & Related papers (2022-03-28T18:28:51Z)
- Discriminative Semantic Feature Pyramid Network with Guided Anchoring for Logo Detection [52.36825190893928]
We propose a novel approach, named Discriminative Semantic Feature Pyramid Network with Guided Anchoring (DSFP-GA).
Our approach mainly consists of Discriminative Semantic Feature Pyramid (DSFP) and Guided Anchoring (GA).
arXiv Detail & Related papers (2021-08-31T11:59:00Z)
- An Effective and Robust Detector for Logo Detection [58.448716977297565]
Some attackers fool the well-trained logo detection model for infringement.
A novel logo detector based on the mechanism of looking and thinking twice is proposed in this paper.
We extend the DetectoRS algorithm to a cascade schema with an equalization loss function, multi-scale transformations, and adversarial data augmentation.
arXiv Detail & Related papers (2021-08-01T10:17:53Z)
- MRDet: A Multi-Head Network for Accurate Oriented Object Detection in Aerial Images [51.227489316673484]
We propose an arbitrary-oriented region proposal network (AO-RPN) to generate oriented proposals transformed from horizontal anchors.
To obtain accurate bounding boxes, we decouple the detection task into multiple subtasks and propose a multi-head network.
Each head is specially designed to learn the features optimal for the corresponding task, which allows our network to detect objects accurately.
arXiv Detail & Related papers (2020-12-24T06:36:48Z)
- Robust Facial Landmark Detection by Cross-order Cross-semantic Deep Network [58.843211405385205]
We propose a cross-order cross-semantic deep network (CCDN) to boost the semantic features learning for robust facial landmark detection.
Specifically, a cross-order two-squeeze multi-excitation (CTM) module is proposed to introduce the cross-order channel correlations for more discriminative representations learning.
A novel cross-order cross-semantic (COCS) regularizer is designed to drive the network to learn cross-order cross-semantic features from different activation for facial landmark detection.
arXiv Detail & Related papers (2020-11-16T08:19:26Z)
- Suppress and Balance: A Simple Gated Network for Salient Object Detection [89.88222217065858]
We propose a simple gated network (GateNet) to solve both issues at once.
With the help of multilevel gate units, the valuable context information from the encoder can be optimally transmitted to the decoder.
In addition, we adopt the atrous spatial pyramid pooling based on the proposed "Fold" operation (Fold-ASPP) to accurately localize salient objects of various scales.
arXiv Detail & Related papers (2020-07-16T02:00:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.