Related papers: Large-scale Remote Sensing Image Target Recognition and Automatic Annotation

Large-scale Remote Sensing Image Target Recognition and Automatic Annotation

URL: http://arxiv.org/abs/2411.07802v1
Date: Tue, 12 Nov 2024 13:57:13 GMT
Title: Large-scale Remote Sensing Image Target Recognition and Automatic Annotation
Authors: Wuzheng Dong,
Abstract summary: This paper presents a method for object recognition and automatic labeling in large-area remote sensing images called LRSAA. The method integrates YOLOv11 and MobileNetV3-SSD object detection algorithms through ensemble learning to enhance model performance.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This paper presents a method for object recognition and automatic labeling in large-area remote sensing images called LRSAA. The method integrates YOLOv11 and MobileNetV3-SSD object detection algorithms through ensemble learning to enhance model performance. Furthermore, it employs Poisson disk sampling segmentation techniques and the EIOU metric to optimize the training and inference processes of segmented images, followed by the integration of results. This approach not only reduces the demand for computational resources but also achieves a good balance between accuracy and speed. The source code for this project has been made publicly available on https://github.com/anaerovane/LRSAA.

Related papers

LRSAA: Large-scale Remote Sensing Image Target Recognition and Automatic Annotation [0.0]
This paper presents a method for object recognition and automatic labeling in large-area remote sensing images called LRSAA. The method integrates YOLOv11 and MobileNetV3-SSD object detection algorithms through ensemble learning to enhance model performance.
arXiv Detail & Related papers (2024-11-24T12:30:12Z)
ESOD: Efficient Small Object Detection on High-Resolution Images [36.80623357577051]
Small objects are usually sparsely distributed and locally clustered. Massive feature extraction computations are wasted on the non-target background area of images. We propose to reuse the detector's backbone to conduct feature-level object-seeking and patch-slicing.
arXiv Detail & Related papers (2024-07-23T12:21:23Z)
LiSD: An Efficient Multi-Task Learning Framework for LiDAR Segmentation and Detection [6.813145466843275]
LiSD is a voxel-based encoder-decoder framework that addresses both segmentation and detection tasks. It achieves the state-of-the-art performance of 83.3% mIoU on the nuScenes segmentation benchmark for lidar-only methods.
arXiv Detail & Related papers (2024-06-11T07:26:54Z)
Improving Online Lane Graph Extraction by Object-Lane Clustering [106.71926896061686]
We propose an architecture and loss formulation to improve the accuracy of local lane graph estimates. The proposed method learns to assign the objects to centerlines by considering the centerlines as cluster centers. We show that our method can achieve significant performance improvements by using the outputs of existing 3D object detection methods.
arXiv Detail & Related papers (2023-07-20T15:21:28Z)
SalienDet: A Saliency-based Feature Enhancement Algorithm for Object Detection for Autonomous Driving [160.57870373052577]
We propose a saliency-based OD algorithm (SalienDet) to detect unknown objects. Our SalienDet utilizes a saliency-based algorithm to enhance image features for object proposal generation. We design a dataset relabeling approach to differentiate the unknown objects from all objects in training sample set to achieve Open-World Detection.
arXiv Detail & Related papers (2023-05-11T16:19:44Z)
De-coupling and De-positioning Dense Self-supervised Learning [65.56679416475943]
Dense Self-Supervised Learning (SSL) methods address the limitations of using image-level feature representations when handling images with multiple objects. We show that they suffer from coupling and positional bias, which arise from the receptive field increasing with layer depth and zero-padding. We demonstrate the benefits of our method on COCO and on a new challenging benchmark, OpenImage-MINI, for object classification, semantic segmentation, and object detection.
arXiv Detail & Related papers (2023-03-29T18:07:25Z)
SALISA: Saliency-based Input Sampling for Efficient Video Object Detection [58.22508131162269]
We propose SALISA, a novel non-uniform SALiency-based Input SAmpling technique for video object detection. We show that SALISA significantly improves the detection of small objects.
arXiv Detail & Related papers (2022-04-05T17:59:51Z)
Fusing Local Similarities for Retrieval-based 3D Orientation Estimation of Unseen Objects [70.49392581592089]
We tackle the task of estimating the 3D orientation of previously-unseen objects from monocular images. We follow a retrieval-based strategy and prevent the network from learning object-specific features. Our experiments on the LineMOD, LineMOD-Occluded, and T-LESS datasets show that our method yields a significantly better generalization to unseen objects than previous works.
arXiv Detail & Related papers (2022-03-16T08:53:00Z)
You Better Look Twice: a new perspective for designing accurate detectors with reduced computations [56.34005280792013]
BLT-net is a new low-computation two-stage object detection architecture. It reduces computations by separating objects from background using a very lite first-stage. Resulting image proposals are then processed in the second-stage by a highly accurate model.
arXiv Detail & Related papers (2021-07-21T12:39:51Z)
Lightweight Convolutional Neural Network with Gaussian-based Grasping Representation for Robotic Grasping Detection [4.683939045230724]
Current object detectors are difficult to strike a balance between high accuracy and fast inference speed. We present an efficient and robust fully convolutional neural network model to perform robotic grasping pose estimation. The network is an order of magnitude smaller than other excellent algorithms.
arXiv Detail & Related papers (2021-01-25T16:36:53Z)
An End-to-end Framework For Low-Resolution Remote Sensing Semantic Segmentation [0.5076419064097732]
We propose an end-to-end framework that unites a super-resolution and a semantic segmentation module. It allows the semantic segmentation network to conduct the reconstruction process, modifying the input image with helpful textures. The results show that the framework is capable of achieving a semantic segmentation performance close to native high-resolution data.
arXiv Detail & Related papers (2020-03-17T21:41:22Z)

This list is automatically generated from the titles and abstracts of the papers in this site.