Landmark Stereo Dataset for Landmark Recognition and Moving Node
Localization in a Non-GPS Battlefield Environment
- URL: http://arxiv.org/abs/2402.12320v1
- Date: Mon, 19 Feb 2024 17:49:23 GMT
- Title: Landmark Stereo Dataset for Landmark Recognition and Moving Node
Localization in a Non-GPS Battlefield Environment
- Authors: Ganesh Sapkota, Sanjay Madria
- Abstract summary: We propose a new strategy of using a landmark anchor node instead of a radio-based anchor node to obtain the virtual coordinates of moving troops or defense forces.
The proposed strategy implements landmark recognition using the YOLOv5 model and landmark distance estimation using an efficient Stereo Matching Algorithm.
- Score: 1.19658449368018
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we propose a new strategy of using a landmark anchor
node instead of a radio-based anchor node to obtain the virtual coordinates
(landmarkID, DISTANCE) of moving troops or defense forces that will help in
tracking and maneuvering the troops along a safe path within a GPS-denied
battlefield environment. The proposed strategy implements landmark recognition
using the YOLOv5 model and landmark distance estimation using an efficient
Stereo Matching Algorithm. We consider a moving node carrying a low-power
mobile device equipped with a calibrated stereo vision camera that captures
stereo images of a scene containing landmarks within the battlefield region;
the landmark locations are stored in an offline server residing within the
device itself. We created a custom landmark image dataset called MSTLandmarkv1 with 34
landmark classes and another landmark stereo dataset of those 34 landmark
instances called MSTLandmarkStereov1. We trained the YOLOv5 model with the
MSTLandmarkv1 dataset and achieved 0.95 mAP@0.5 IoU and 0.767 mAP@[0.5:0.95]
IoU. We calculated the distance from a node to the landmark utilizing the
bounding box coordinates and the depth map generated by the improved SGM
algorithm using MSTLandmarkStereov1. The tuples of landmark IDs obtained from
the detection results and the distances calculated by the SGM algorithm are
stored as the virtual coordinates of a node. In future work, we will use these
virtual coordinates to obtain the location of a node using an efficient
trilateration algorithm and optimize the node position using the appropriate
optimization method.
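The pipeline in the abstract reduces to two small computations: a disparity at a detected landmark's bounding-box center becomes a metric distance via the standard rectified-stereo relation Z = f·B/d, and a set of (landmarkID, distance) virtual coordinates against landmarks with known positions can be resolved into a node position by trilateration. The sketch below is not the authors' code; the camera parameters, landmark IDs, and anchor positions are invented for illustration, and the trilateration uses a basic three-anchor 2D linearization rather than the paper's (future-work) optimized algorithm.

```python
def stereo_distance(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth from a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a valid match")
    return focal_px * baseline_m / disparity_px


def trilaterate_2d(anchors, distances):
    """2D trilateration from exactly three (x, y) landmark anchors and
    their measured distances.

    Subtracting the first circle equation from the other two yields a
    2x2 linear system, solved here by Cramer's rule.
    """
    (x0, y0), (x1, y1), (x2, y2) = anchors
    d0, d1, d2 = distances
    a11, a12 = 2 * (x1 - x0), 2 * (y1 - y0)
    a21, a22 = 2 * (x2 - x0), 2 * (y2 - y0)
    b1 = d0**2 - d1**2 + x1**2 - x0**2 + y1**2 - y0**2
    b2 = d0**2 - d2**2 + x2**2 - x0**2 + y2**2 - y0**2
    det = a11 * a22 - a12 * a21
    return (b1 * a22 - b2 * a12) / det, (a11 * b2 - a21 * b1) / det


if __name__ == "__main__":
    # Made-up example: 700 px focal length, 0.12 m baseline, 8.4 px disparity
    # at the bounding-box center of a recognized landmark.
    z = stereo_distance(700.0, 0.12, 8.4)
    virtual_coord = ("landmark_07", z)  # (landmarkID, DISTANCE) tuple
    print(virtual_coord)

    # Node at true position (3, 4) measured against three known landmarks.
    anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    dists = [5.0, 65**0.5, 45**0.5]
    print(trilaterate_2d(anchors, dists))
```

With noisy real measurements there are more than three landmarks and the circle equations are inconsistent, which is why the paper defers to a least-squares trilateration plus a position-optimization step instead of the exact solve shown here.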
Related papers
- Landmark-based Localization using Stereo Vision and Deep Learning in
GPS-Denied Battlefield Environment [1.19658449368018]
This paper proposes a novel framework for localization in non-GPS battlefield environments using only passive camera sensors.
The proposed method utilizes a custom-calibrated stereo vision camera for distance estimation and the YOLOv8s model, which is trained and fine-tuned with our real-world dataset for landmark recognition.
Experimental results demonstrate that our proposed framework performs better than existing anchor-based DV-Hop algorithms and competes with the most efficient vision-based algorithms in terms of localization error (RMSE).
arXiv Detail & Related papers (2024-02-19T21:20:56Z) - FocusTune: Tuning Visual Localization through Focus-Guided Sampling [61.79440120153917]
FocusTune is a focus-guided sampling technique to improve the performance of visual localization algorithms.
We demonstrate that FocusTune improves or matches state-of-the-art performance whilst keeping ACE's appealingly low storage and compute requirements.
This combination of high performance and low compute and storage requirements is particularly promising for applications in areas like mobile robotics and augmented reality.
arXiv Detail & Related papers (2023-11-06T04:58:47Z) - V-DETR: DETR with Vertex Relative Position Encoding for 3D Object
Detection [73.37781484123536]
We introduce a highly performant 3D object detector for point clouds using the DETR framework.
To address a limitation of the plain DETR framework on point clouds, we introduce a novel 3D Vertex Relative Position Encoding (3DV-RPE) method.
We show exceptional results on the challenging ScanNetV2 benchmark.
arXiv Detail & Related papers (2023-08-08T17:14:14Z) - Combining geolocation and height estimation of objects from street level
imagery [5.887281983256354]
We propose a pipeline for combined multi-class object geolocation and height estimation from street level RGB imagery.
The proposed technique uses image metadata along with coordinates of objects detected in the image plane as found by a custom-trained Convolutional Neural Network.
arXiv Detail & Related papers (2023-05-14T19:40:02Z) - Neural Implicit Dense Semantic SLAM [83.04331351572277]
We propose a novel RGBD vSLAM algorithm that learns a memory-efficient, dense 3D geometry, and semantic segmentation of an indoor scene in an online manner.
Our pipeline combines classical 3D vision-based tracking and loop closing with neural fields-based mapping.
Our proposed algorithm can greatly enhance scene perception and assist with a range of robot control problems.
arXiv Detail & Related papers (2023-04-27T23:03:52Z) - Satellite Image Based Cross-view Localization for Autonomous Vehicle [59.72040418584396]
This paper shows that by using an off-the-shelf high-definition satellite image as a ready-to-use map, we are able to achieve cross-view vehicle localization up to a satisfactory accuracy.
Our method is validated on KITTI and Ford Multi-AV Seasonal datasets as ground view and Google Maps as the satellite view.
arXiv Detail & Related papers (2022-07-27T13:16:39Z) - Progressive Coordinate Transforms for Monocular 3D Object Detection [52.00071336733109]
We propose a novel and lightweight approach, dubbed Progressive Coordinate Transforms (PCT), to facilitate learning coordinate representations.
arXiv Detail & Related papers (2021-08-12T15:22:33Z) - From Voxel to Point: IoU-guided 3D Object Detection for Point Cloud with
Voxel-to-Point Decoder [79.39041453836793]
We present an Intersection-over-Union (IoU) guided two-stage 3D object detector with a voxel-to-point decoder.
We propose a residual voxel-to-point decoder to extract the point features in addition to the map-view features from the voxel-based Region Proposal Network (RPN).
We propose a simple and efficient method to align the estimated IoUs to the refined proposal boxes as a more relevant localization confidence.
arXiv Detail & Related papers (2021-08-08T14:30:13Z) - Object Tracking and Geo-localization from Street Images [4.5958644027273685]
We present a framework that detects and geolocalizes traffic signs from low frame rate street videos.
The proposed system uses a modified version of RetinaNet (GPS-RetinaNet), which predicts a positional offset for each sign relative to the camera.
The proposed dataset covers a diverse set of environments gathered from a broad selection of roads.
arXiv Detail & Related papers (2021-07-13T17:32:04Z) - Automated 3D cephalometric landmark identification using computerized
tomography [1.4349468613117398]
Identification of 3D cephalometric landmarks that serve as proxy to the shape of human skull is the fundamental step in cephalometric analysis.
Recently, automatic landmarking of 2D cephalograms using deep learning (DL) has achieved great success, but 3D landmarking for more than 80 landmarks has not yet reached a satisfactory level.
This paper presents a semi-supervised DL method for 3D landmarking that takes advantage of an anonymized landmark dataset from which the paired CT data has been removed.
arXiv Detail & Related papers (2020-12-16T07:29:32Z) - Evaluation of 3D CNN Semantic Mapping for Rover Navigation [0.6882042556551609]
We present a technique to generate accurate three-dimensional semantic maps for the Martian environment.
The algorithm uses as input a stereo image acquired by a camera mounted on a rover.
We evaluate our approach on the ESA Katwijk Beach Planetary Rover dataset.
arXiv Detail & Related papers (2020-06-17T10:24:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.