Object Tracking and Geo-localization from Street Images
- URL: http://arxiv.org/abs/2107.06257v1
- Date: Tue, 13 Jul 2021 17:32:04 GMT
- Title: Object Tracking and Geo-localization from Street Images
- Authors: Daniel Wilson, Thayer Alshaabi, Colin Van Oort, Xiaohan Zhang,
Jonathan Nelson, Safwan Wshah
- Abstract summary: We present a framework that detects and geolocalizes traffic signs from low frame rate street videos.
The proposed system uses a modified version of RetinaNet (GPS-RetinaNet), which predicts a positional offset for each sign relative to the camera.
The proposed dataset covers a diverse set of environments gathered from a broad selection of roads.
- Score: 4.5958644027273685
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Geo-localizing static objects from street images is challenging but also very
important for road asset mapping and autonomous driving. In this paper we
present a two-stage framework that detects and geolocalizes traffic signs from
low frame rate street videos. Our proposed system uses a modified version of
RetinaNet (GPS-RetinaNet), which predicts a positional offset for each sign
relative to the camera, in addition to performing the standard classification
and bounding box regression. Candidate sign detections from GPS-RetinaNet are
condensed into geolocalized signs by our custom tracker, which consists of a
learned metric network and a variant of the Hungarian Algorithm. Our metric
network estimates the similarity between pairs of detections, and the
Hungarian Algorithm then matches detections across images using these
similarity scores. Our models were trained using an updated
version of the ARTS dataset, which contains 25,544 images and 47,589 sign
annotations [arts]. The proposed dataset covers a diverse set of
environments gathered from a broad selection of roads. Each annotation contains
a sign class label, its geospatial location, an assembly label, a side of road
indicator, and unique identifiers that aid in the evaluation. This dataset will
support future progress in the field, and the proposed system demonstrates how
to take advantage of some of the unique characteristics of a realistic
geolocalization dataset.
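
To make the pipeline concrete, the sketch below shows, under stated assumptions, how candidate detections could be condensed into geolocalized signs: a learned metric network scores pairs of detections, the Hungarian Algorithm (here via scipy.optimize.linear_sum_assignment) matches them across frames, and the camera-relative offsets predicted by GPS-RetinaNet are averaged into a single location. The SignAnnotation field names, the metric_net callable, the detection dictionaries, and the local-coordinate averaging are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch, not the authors' code: illustrative field names for an
# ARTS-style annotation and for the detection-matching step described above.
from dataclasses import dataclass

import numpy as np
from scipy.optimize import linear_sum_assignment  # Hungarian algorithm


@dataclass
class SignAnnotation:
    """Hypothetical record mirroring the fields listed in the abstract."""
    sign_class: str          # sign class label
    latitude: float          # geospatial location
    longitude: float
    assembly: str            # assembly label
    side_of_road: str        # side-of-road indicator, e.g. "left" / "right"
    sign_id: str             # unique identifier used in the evaluation


def match_detections(dets_a, dets_b, metric_net, min_sim=0.5):
    """Match detections across two frames with a learned similarity.

    `metric_net` is assumed to map a pair of detection embeddings to a
    similarity score in [0, 1]; `dets_*` are dicts with an "embedding" key.
    """
    sim = np.array([[metric_net(a["embedding"], b["embedding"])
                     for b in dets_b] for a in dets_a])
    # linear_sum_assignment minimizes cost, so negate the similarity.
    rows, cols = linear_sum_assignment(-sim)
    return [(i, j) for i, j in zip(rows, cols) if sim[i, j] >= min_sim]


def geolocalize(track, camera_positions):
    """Condense a track of matched detections into one sign location by
    adding each detection's predicted camera-relative offset to the camera
    position and averaging (all in one local metric frame, for simplicity)."""
    points = [camera_positions[d["frame"]] + d["offset"] for d in track]
    return np.mean(points, axis=0)
```

In the actual system the similarity scores come from the paper's metric network and the offsets from GPS-RetinaNet's extra regression head; everything else here is scaffolding.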
Related papers
- Neural Semantic Map-Learning for Autonomous Vehicles [85.8425492858912]
We present a mapping system that fuses local submaps gathered from a fleet of vehicles at a central instance to produce a coherent map of the road environment.
Our method jointly aligns and merges the noisy and incomplete local submaps using a scene-specific Neural Signed Distance Field.
We leverage memory-efficient sparse feature-grids to scale to large areas and introduce a confidence score to model uncertainty in scene reconstruction.
arXiv Detail & Related papers (2024-10-10T10:10:03Z) - Weakly-supervised Camera Localization by Ground-to-satellite Image Registration [52.54992898069471]
We propose a weakly supervised learning strategy for ground-to-satellite image registration.
It derives positive and negative satellite images for each ground image.
We also propose a self-supervision strategy for cross-view image relative rotation estimation.
arXiv Detail & Related papers (2024-09-10T12:57:16Z) - ALINA: Advanced Line Identification and Notation Algorithm [4.12089570007199]
Traditional labeling methods, such as crowd-sourcing, are prohibitive due to cost, data privacy, amount of time, and potential errors on large datasets.
We propose a novel annotation framework, Advanced Line Identification and Notation Algorithm (ALINA), which can be used for labeling taxiway datasets.
arXiv Detail & Related papers (2024-06-13T03:10:22Z) - GeoCLIP: Clip-Inspired Alignment between Locations and Images for
Effective Worldwide Geo-localization [61.10806364001535]
Worldwide Geo-localization aims to pinpoint the precise location of images taken anywhere on Earth.
Existing approaches divide the globe into discrete geographic cells, transforming the problem into a classification task.
We propose GeoCLIP, a novel CLIP-inspired Image-to-GPS retrieval approach that enforces alignment between the image and its corresponding GPS locations.
arXiv Detail & Related papers (2023-09-27T20:54:56Z) - G^3: Geolocation via Guidebook Grounding [92.46774241823562]
We study explicit knowledge from human-written guidebooks that describe the salient and class-discriminative visual features humans use for geolocation.
We propose the task of Geolocation via Guidebook Grounding that uses a dataset of StreetView images from a diverse set of locations.
Our approach substantially outperforms a state-of-the-art image-only geolocation method, with an improvement of over 5% in Top-1 accuracy.
arXiv Detail & Related papers (2022-11-28T16:34:40Z) - Visual Cross-View Metric Localization with Dense Uncertainty Estimates [11.76638109321532]
This work addresses visual cross-view metric localization for outdoor robotics.
Given a ground-level color image and a satellite patch that contains the local surroundings, the task is to identify the location of the ground camera within the satellite patch.
We devise a novel network architecture with denser satellite descriptors, similarity matching at the bottleneck, and a dense spatial distribution as output to capture multi-modal localization ambiguities.
arXiv Detail & Related papers (2022-08-17T20:12:23Z) - Accurate 3-DoF Camera Geo-Localization via Ground-to-Satellite Image
Matching [102.39635336450262]
We address the problem of ground-to-satellite image geo-localization by matching a query image captured at the ground level against a large-scale database with geotagged satellite images.
Our new method is able to achieve the fine-grained location of a query image, up to pixel size precision of the satellite image.
arXiv Detail & Related papers (2022-03-26T20:10:38Z) - Continuous Self-Localization on Aerial Images Using Visual and Lidar
Sensors [25.87104194833264]
We propose a novel method for geo-tracking in outdoor environments by registering a vehicle's sensor information with aerial imagery of an unseen target region.
We train a model in a metric learning setting to extract visual features from ground and aerial images.
Our method is the first to utilize on-board cameras in an end-to-end differentiable model for metric self-localization on unseen orthophotos.
arXiv Detail & Related papers (2022-03-07T12:25:44Z) - AutoGeoLabel: Automated Label Generation for Geospatial Machine Learning [69.47585818994959]
We evaluate a big data processing pipeline to auto-generate labels for remote sensing data.
We utilize the big geo-data platform IBM PAIRS to dynamically generate such labels in dense urban areas.
arXiv Detail & Related papers (2022-01-31T20:02:22Z) - Visual and Object Geo-localization: A Comprehensive Survey [11.120155713865918]
Geo-localization refers to the process of determining where on Earth some entity is located.
This paper provides a comprehensive survey of geo-localization involving images, which covers either determining from where an image has been captured (Image geo-localization) or geo-locating objects within an image (Object geo-localization).
We will provide an in-depth study, including a summary of popular algorithms, a description of proposed datasets, and an analysis of performance results to illustrate the current state of each field.
arXiv Detail & Related papers (2021-12-30T20:46:53Z) - Automatic Signboard Detection and Localization in Densely Populated
Developing Cities [0.0]
Signboard detection in natural scene images is the foremost task for error-free information retrieval.
We present a novel object detection approach that can detect signboards automatically and is suitable for such cities.
Our proposed method can detect signboards accurately (even if the images contain multiple signboards with diverse shapes and colours in a noisy background), achieving 0.90 mAP (mean average precision).
arXiv Detail & Related papers (2020-03-04T08:04:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.