Automatic Signboard Detection and Localization in Densely Populated
Developing Cities
- URL: http://arxiv.org/abs/2003.01936v4
- Date: Mon, 22 Aug 2022 15:03:46 GMT
- Title: Automatic Signboard Detection and Localization in Densely Populated
Developing Cities
- Authors: Md. Sadrul Islam Toaha, Sakib Bin Asad, Chowdhury Rafeed Rahman, S.M.
Shahriar Haque, Mahfuz Ara Proma, Md. Ahsan Habib Shuvo, Tashin Ahmed, Md.
Amimul Basher
- Abstract summary: Signboard detection in natural scene images is the foremost task for error-free information retrieval.
We present a novel object detection approach that can detect signboards automatically and is suitable for such cities.
Our proposed method can detect signboards accurately (even if the images contain multiple signboards with diverse shapes and colours in a noisy background), achieving a 0.90 mAP (mean average precision) score on the SVSO independent test set.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Most establishments in developing cities are digitally unlabeled because
of the lack of automatic annotation systems. Hence, location and trajectory
services such as Google Maps, Uber, etc. remain underutilized in such cities.
Accurate signboard detection in natural scene images is the foremost task for
error-free information retrieval from such city streets. Yet developing an
accurate signboard localization system is still an unresolved challenge because
signboards have diverse appearances that include textual images and perplexing
backgrounds. We present a novel object detection approach that can detect
signboards automatically and is suitable for such cities. We use Faster R-CNN
based localization, incorporating two specialized pretraining methods and a
runtime-efficient hyperparameter value selection algorithm. We have taken an
incremental approach in reaching our final proposed method through detailed
evaluation and comparison with baselines using our constructed SVSO (Street
View Signboard Objects) dataset, which contains natural scene images of
signboards from six developing countries. We demonstrate state-of-the-art
performance of our proposed method on both the SVSO dataset and the Open Image
Dataset. Our proposed method can detect signboards accurately (even if the
images contain multiple signboards with diverse shapes and colours in a noisy
background), achieving a 0.90 mAP (mean average precision) score on the SVSO
independent test set.
Our implementation is available at:
https://github.com/sadrultoaha/Signboard-Detection
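The abstract reports detection quality as mAP. As a rough illustration of what that metric measures, here is a minimal sketch of average precision for a single object class (signboard), where mAP reduces to the AP of that one class. The IoU threshold of 0.5, the all-point interpolation, and the function names `iou` and `average_precision` are assumptions made for this example; the paper's exact evaluation protocol is not stated in this summary.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, ground_truth, iou_threshold=0.5):
    """Single-class AP (assumed protocol: IoU >= 0.5, all-point interpolation).

    detections:   list of (image_id, score, box)
    ground_truth: dict mapping image_id -> list of boxes
    """
    total_gt = sum(len(boxes) for boxes in ground_truth.values())
    if total_gt == 0:
        return 0.0

    # Each ground-truth box may be matched by at most one detection.
    matched = {img: [False] * len(boxes) for img, boxes in ground_truth.items()}
    hits = []
    for image_id, _score, box in sorted(detections, key=lambda d: d[1], reverse=True):
        best_iou, best_idx = 0.0, -1
        for idx, gt_box in enumerate(ground_truth.get(image_id, [])):
            overlap = iou(box, gt_box)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx >= 0 and best_iou >= iou_threshold and not matched[image_id][best_idx]:
            matched[image_id][best_idx] = True
            hits.append(1)  # true positive
        else:
            hits.append(0)  # false positive (poor overlap or duplicate match)

    # Precision and recall after each detection, in descending score order.
    precisions, recalls, true_pos = [], [], 0
    for rank, hit in enumerate(hits, start=1):
        true_pos += hit
        precisions.append(true_pos / rank)
        recalls.append(true_pos / total_gt)

    # Precision envelope: make precision non-increasing as recall grows.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Area under the precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap
```

With several classes, mAP would be the mean of the per-class AP values; other benchmarks average over multiple IoU thresholds instead of fixing one, so scores computed under different protocols are not directly comparable.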
Related papers
- AddressCLIP: Empowering Vision-Language Models for City-wide Image Address Localization [57.34659640776723]
We propose an end-to-end framework named AddressCLIP to solve the problem with more semantics.
We have built three datasets from Pittsburgh and San Francisco on different scales specifically for the IAL problem.
arXiv Detail & Related papers (2024-07-11T03:18:53Z)
- HPointLoc: Point-based Indoor Place Recognition using Synthetic RGB-D Images [58.720142291102135]
We present a novel dataset named HPointLoc, specially designed for exploring the capabilities of visual place recognition in indoor environments.
The dataset is based on the popular Habitat simulator, in which it is possible to generate indoor scenes using both own sensor data and open datasets.
arXiv Detail & Related papers (2022-12-30T12:20:56Z)
- SpaText: Spatio-Textual Representation for Controllable Image Generation [61.89548017729586]
SpaText is a new method for text-to-image generation using open-vocabulary scene control.
In addition to a global text prompt that describes the entire scene, the user provides a segmentation map.
We show its effectiveness on two state-of-the-art diffusion models: pixel-based and latent-conditional-based.
arXiv Detail & Related papers (2022-11-25T18:59:10Z)
- Which country is this picture from? New data and methods for DNN-based country recognition [33.73817899937691]
Previous works have focused mostly on the estimation of the geo-coordinates where a picture has been taken.
We introduce a new dataset, the VIPPGeo dataset, containing almost 4 million images.
We use the dataset to train a deep learning architecture casting the country recognition problem as a classification problem.
arXiv Detail & Related papers (2022-09-02T10:56:41Z)
- Visual Cross-View Metric Localization with Dense Uncertainty Estimates [11.76638109321532]
This work addresses visual cross-view metric localization for outdoor robotics.
Given a ground-level color image and a satellite patch that contains the local surroundings, the task is to identify the location of the ground camera within the satellite patch.
We devise a novel network architecture with denser satellite descriptors, similarity matching at the bottleneck, and a dense spatial distribution as output to capture multi-modal localization ambiguities.
arXiv Detail & Related papers (2022-08-17T20:12:23Z)
- SeeTheSeams: Localized Detection of Seam Carving based Image Forgery in Satellite Imagery [15.127101376238418]
Seam carving is a popular technique for content-aware image manipulation.
This paper proposes a novel approach for detecting and localizing seams in such images.
arXiv Detail & Related papers (2021-08-28T00:00:37Z)
- Object Tracking and Geo-localization from Street Images [4.5958644027273685]
We present a framework that detects and geolocalizes traffic signs from low frame rate street videos.
The proposed system uses a modified version of RetinaNet (GPS-RetinaNet), which predicts a positional offset for each sign relative to the camera.
The proposed dataset covers a diverse set of environments gathered from a broad selection of roads.
arXiv Detail & Related papers (2021-07-13T17:32:04Z)
- City-scale Scene Change Detection using Point Clouds [71.73273007900717]
We propose a method for detecting structural changes in a city using images captured from mounted cameras over two different times.
A direct comparison of the two point clouds for change detection is not ideal due to inaccurate geo-location information.
To circumvent this problem, we propose a deep learning-based non-rigid registration on the point clouds.
Experiments show that our method is able to detect scene changes effectively, even in the presence of viewpoint and illumination differences.
arXiv Detail & Related papers (2021-03-26T08:04:13Z)
- Bounding Boxes Are All We Need: Street View Image Classification via Context Encoding of Detected Buildings [7.1235778791928634]
A "Detector-Encoder-Classifier" framework is proposed.
The "BEAUTY" dataset can be used not only for street view image classification, but also for multi-class building detection.
arXiv Detail & Related papers (2020-10-03T08:49:51Z)
- BoMuDANet: Unsupervised Adaptation for Visual Scene Understanding in Unstructured Driving Environments [54.22535063244038]
We present an unsupervised adaptation approach for visual scene understanding in unstructured traffic environments.
Our method is designed for unstructured real-world scenarios with dense and heterogeneous traffic consisting of cars, trucks, two- and three-wheelers, and pedestrians.
arXiv Detail & Related papers (2020-09-22T08:25:44Z)
- UC-Net: Uncertainty Inspired RGB-D Saliency Detection via Conditional Variational Autoencoders [81.5490760424213]
We propose the first framework (UCNet) to employ uncertainty for RGB-D saliency detection by learning from the data labeling process.
Inspired by the saliency data labeling process, we propose a probabilistic RGB-D saliency detection network.
arXiv Detail & Related papers (2020-04-13T04:12:59Z)