RL-LOGO: Deep Reinforcement Learning Localization for Logo Recognition
- URL: http://arxiv.org/abs/2312.16792v1
- Date: Thu, 28 Dec 2023 02:44:28 GMT
- Title: RL-LOGO: Deep Reinforcement Learning Localization for Logo Recognition
- Authors: Masato Fujitake
- Abstract summary: This paper proposes a novel logo image recognition approach incorporating a localization technique based on reinforcement learning.
Because there is no annotation for the position coordinates, it is impossible to train and infer the location of the logo in the image.
We demonstrate that the proposed method is a promising approach to logo recognition in real-world applications.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes a novel logo image recognition approach incorporating a
localization technique based on reinforcement learning. Logo recognition is an
image classification task identifying a brand in an image. As the size and
position of a logo vary widely from image to image, it is necessary to
determine its position for accurate recognition. However, because there is no
annotation for the position coordinates, it is impossible to train and infer
the location of the logo in the image. Therefore, we propose a deep
reinforcement learning localization method for logo recognition (RL-LOGO). It
utilizes deep reinforcement learning to identify a logo region in images
without annotations of the positions, thereby improving classification
accuracy. We demonstrated a significant improvement in accuracy compared with
existing methods in several published benchmarks. Specifically, we achieved an
18-point accuracy improvement over competitive methods on the complex dataset
Logo-2K+. This demonstrates that the proposed method is a promising approach to
logo recognition in real-world applications.
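The abstract does not say which actions, reward, or policy RL-LOGO uses, so the following is only a generic sketch of reinforcement-learning box refinement without position labels, not the paper's method. It assumes a hypothetical discrete action set that shrinks a crop box, a reward of +1 or -1 depending on whether a classifier's confidence rises, and a greedy loop standing in for a learned policy. `toy_score` is a stand-in for classifier confidence; it peeks at a hidden logo box purely so the example runs without a trained model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned crop in normalized image coordinates (0..1)."""
    x0: float
    y0: float
    x1: float
    y1: float


# Hypothetical action set (an assumption, not taken from the paper).
ACTIONS = ("shrink_left", "shrink_right", "shrink_top", "shrink_bottom")


def apply_action(box: Box, action: str, step: float = 0.1) -> Box:
    """Shrink the box from one side by `step` times its current size."""
    dx = step * (box.x1 - box.x0)
    dy = step * (box.y1 - box.y0)
    if action == "shrink_left":
        return Box(box.x0 + dx, box.y0, box.x1, box.y1)
    if action == "shrink_right":
        return Box(box.x0, box.y0, box.x1 - dx, box.y1)
    if action == "shrink_top":
        return Box(box.x0, box.y0 + dy, box.x1, box.y1)
    if action == "shrink_bottom":
        return Box(box.x0, box.y0, box.x1, box.y1 - dy)
    raise ValueError(f"unknown action: {action}")


def reward(conf_before: float, conf_after: float) -> float:
    """Assumed reward: +1 if classification confidence improved, else -1."""
    return 1.0 if conf_after > conf_before else -1.0


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes."""
    iw = max(0.0, min(a.x1, b.x1) - max(a.x0, b.x0))
    ih = max(0.0, min(a.y1, b.y1) - max(a.y0, b.y0))
    inter = iw * ih
    union = (a.x1 - a.x0) * (a.y1 - a.y0) + (b.x1 - b.x0) * (b.y1 - b.y0) - inter
    return inter / union if union > 0 else 0.0


# Hidden logo location, used only by the toy confidence function below.
LOGO = Box(0.4, 0.4, 0.6, 0.6)


def toy_score(box: Box) -> float:
    """Stand-in for classifier confidence on the cropped region."""
    return iou(box, LOGO)


def greedy_episode(score_fn, box: Box, max_steps: int = 10) -> Box:
    """Greedy stand-in for a learned policy: take the best action until none helps."""
    for _ in range(max_steps):
        current = score_fn(box)
        candidate = max((apply_action(box, a) for a in ACTIONS), key=score_fn)
        if reward(current, score_fn(candidate)) < 0:
            break  # no action improves confidence, so stop refining
        box = candidate
    return box


start = Box(0.0, 0.0, 1.0, 1.0)
final = greedy_episode(toy_score, start)
print(final, toy_score(start), toy_score(final))
```

In a real system the greedy loop would be replaced by a trained agent (for example a Q-network over image features), and the confidence would come from the logo classifier itself rather than from a known box.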
Related papers
- LogoSticker: Inserting Logos into Diffusion Models for Customized Generation [73.59571559978278]
We introduce the task of logo insertion into text-to-image models.
Our goal is to insert logo identities into diffusion models and enable their seamless synthesis in varied contexts.
We present a novel two-phase pipeline LogoSticker to tackle this task.
arXiv Detail & Related papers (2024-07-18T17:54:49Z)
- A Generative Approach for Wikipedia-Scale Visual Entity Recognition [56.55633052479446]
We address the task of mapping a given query image to one of the 6 million existing entities in Wikipedia.
We introduce a novel Generative Entity Recognition framework, which learns to auto-regressively decode a semantic and discriminative "code" identifying the target entity.
arXiv Detail & Related papers (2024-03-04T13:47:30Z)
- Deep Homography Estimation for Visual Place Recognition [49.235432979736395]
We propose a transformer-based deep homography estimation (DHE) network.
It takes the dense feature map extracted by a backbone network as input and fits homography for fast and learnable geometric verification.
Experiments on benchmark datasets show that our method can outperform several state-of-the-art methods.
arXiv Detail & Related papers (2024-02-25T13:22:17Z)
- Image-Text Pre-Training for Logo Recognition [0.27195102129094995]
We propose two novel contributions to improve the matching model's performance.
A standard paradigm of fine-tuning ImageNet pre-trained models fails to discover the text sensitivity necessary to solve the matching problem effectively.
We show that the same vision backbone pre-trained on image-text data, when fine-tuned on OpenLogoDet3K47, achieves 98.6% recall@1.
arXiv Detail & Related papers (2023-09-18T23:18:02Z)
- A Cross-direction Task Decoupling Network for Small Logo Detection [28.505952002735334]
We propose the Cross-direction Task Decoupling Network (CTDNet) for small logo detection.
Comprehensive experiments on four logo datasets demonstrate the effectiveness and efficiency of the proposed method.
arXiv Detail & Related papers (2023-05-04T02:23:34Z)
- Contrastive Multi-View Textual-Visual Encoding: Towards One Hundred Thousand-Scale One-Shot Logo Identification [2.243832625209014]
We study the problem of identifying logos of business brands in natural scenes in an open-set one-shot setting.
We propose a novel multi-view textual-visual encoding framework that encodes text appearing in the logos.
We evaluate our proposed framework for cropped logo verification, cropped logo identification, and end-to-end logo identification in natural scene tasks.
arXiv Detail & Related papers (2022-11-23T12:59:41Z)
- Deep Learning for Logo Detection: A Survey [59.278443852492465]
This paper reviews the advances in applying deep learning techniques to logo detection.
We perform an in-depth analysis of the existing logo detection strategies and the strengths and weaknesses of each learning strategy.
We summarize the applications of logo detection in various fields, from intelligent transportation and brand monitoring to copyright and trademark compliance.
arXiv Detail & Related papers (2022-10-10T02:07:41Z)
- Multi-Label Logo Recognition and Retrieval based on Weighted Fusion of Neural Features [6.6144185930393435]
We propose a system for the multi-label classification and similarity search of logo images.
The method retrieves the most similar logos on the basis of their shape, color, business sector, semantics, and general characteristics.
The proposed approach is evaluated using the European Union Trademark (EUTM) dataset.
arXiv Detail & Related papers (2022-05-11T11:40:40Z)
- LEAD: Self-Supervised Landmark Estimation by Aligning Distributions of Feature Similarity [49.84167231111667]
Existing works in self-supervised landmark detection are based on learning dense (pixel-level) feature representations from an image.
We introduce an approach to enhance the learning of dense equivariant representations in a self-supervised fashion.
We show that having such a prior in the feature extractor helps in landmark detection, even with a drastically limited number of annotations.
arXiv Detail & Related papers (2022-04-06T17:48:18Z)
- Discriminative Semantic Feature Pyramid Network with Guided Anchoring for Logo Detection [52.36825190893928]
We propose a novel approach, named Discriminative Semantic Feature Pyramid Network with Guided Anchoring (DSFP-GA).
Our approach mainly consists of a Discriminative Semantic Feature Pyramid (DSFP) and Guided Anchoring (GA).
arXiv Detail & Related papers (2021-08-31T11:59:00Z)
- Zero-Shot Recognition through Image-Guided Semantic Classification [9.291055558504588]
We present a new embedding-based framework for zero-shot learning (ZSL).
Motivated by the binary relevance method for multi-label classification, we propose to inversely learn the mapping between an image and a semantic classifier.
IGSC is conceptually simple and can be realized by a slight enhancement of an existing deep architecture for classification.
arXiv Detail & Related papers (2020-07-23T06:22:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.