Related papers: AirShot: Efficient Few-Shot Detection for Autonomous Exploration

AirShot: Efficient Few-Shot Detection for Autonomous Exploration

URL: http://arxiv.org/abs/2404.05069v1
Date: Sun, 7 Apr 2024 20:39:31 GMT
Title: AirShot: Efficient Few-Shot Detection for Autonomous Exploration
Authors: Zihan Wang, Bowen Li, Chen Wang, Sebastian Scherer,
Abstract summary: Few-shot object detection has drawn increasing attention in the field of robotic exploration, where robots are required to find unseen objects with a few online provided examples. Existing methods still face performance and efficiency challenges, mainly due to unreliable features and exhaustive class loops. We propose a new paradigm AirShot, and discover that, by fully exploiting the valuable correlation map, AirShot can result in a more robust few-shot object detection system.
Score: 18.09990606406497
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Few-shot object detection has drawn increasing attention in the field of robotic exploration, where robots are required to find unseen objects with a few online provided examples. Despite recent efforts have been made to yield online processing capabilities, slow inference speeds of low-powered robots fail to meet the demands of real-time detection-making them impractical for autonomous exploration. Existing methods still face performance and efficiency challenges, mainly due to unreliable features and exhaustive class loops. In this work, we propose a new paradigm AirShot, and discover that, by fully exploiting the valuable correlation map, AirShot can result in a more robust and faster few-shot object detection system, which is more applicable to robotics community. The core module Top Prediction Filter (TPF) can operate on multi-scale correlation maps in both the training and inference stages. During training, TPF supervises the generation of a more representative correlation map, while during inference, it reduces looping iterations by selecting top-ranked classes, thus cutting down on computational costs with better performance. Surprisingly, this dual functionality exhibits general effectiveness and efficiency on various off-the-shelf models. Exhaustive experiments on COCO2017, VOC2014, and SubT datasets demonstrate that TPF can significantly boost the efficacy and efficiency of most off-the-shelf models, achieving up to 36.4% precision improvements along with 56.3% faster inference speed. Code and Data are at: https://github.com/ImNotPrepared/AirShot.

Related papers

TSCM: A Teacher-Student Model for Vision Place Recognition Using Cross-Metric Knowledge Distillation [6.856317526681759]
Visual place recognition plays a pivotal role in autonomous exploration and navigation of mobile robots. Existing methods overcome this by exploiting powerful yet large networks. We propose a high-performance teacher and lightweight student distillation framework called TSCM.
arXiv Detail & Related papers (2024-04-02T02:29:41Z)
Cross-Cluster Shifting for Efficient and Effective 3D Object Detection in Autonomous Driving [69.20604395205248]
We present a new 3D point-based detector model, named Shift-SSD, for precise 3D object detection in autonomous driving. We introduce an intriguing Cross-Cluster Shifting operation to unleash the representation capacity of the point-based detector. We conduct extensive experiments on the KITTI, runtime, and nuScenes datasets, and the results demonstrate the state-of-the-art performance of Shift-SSD.
arXiv Detail & Related papers (2024-03-10T10:36:32Z)
DiffNAS: Bootstrapping Diffusion Models by Prompting for Better Architectures [63.12993314908957]
We propose a base model search approach, denoted "DiffNAS" We leverage GPT-4 as a supernet to expedite the search, supplemented with a search memory to enhance the results. Rigorous experimentation corroborates that our algorithm can augment the search efficiency by 2 times under GPT-based scenarios.
arXiv Detail & Related papers (2023-10-07T09:10:28Z)
Value function estimation using conditional diffusion models for control [62.27184818047923]
We propose a simple algorithm called Diffused Value Function (DVF) It learns a joint multi-step model of the environment-robot interaction dynamics using a diffusion model. We show how DVF can be used to efficiently capture the state visitation measure for multiple controllers.
arXiv Detail & Related papers (2023-06-09T18:40:55Z)
Efficient Few-Shot Object Detection via Knowledge Inheritance [62.36414544915032]
Few-shot object detection (FSOD) aims at learning a generic detector that can adapt to unseen tasks with scarce training samples. We present an efficient pretrain-transfer framework (PTF) baseline with no computational increment. We also propose an adaptive length re-scaling (ALR) strategy to alleviate the vector length inconsistency between the predicted novel weights and the pretrained base weights.
arXiv Detail & Related papers (2022-03-23T06:24:31Z)
Analysis and Adaptation of YOLOv4 for Object Detection in Aerial Images [0.0]
Our work shows the adaptation of the popular YOLOv4 framework for predicting the objects and their locations in aerial images. The trained model resulted in a mean average precision (mAP) of 45.64% with an inference speed reaching 8.7 FPS on the Tesla K80 GPU. A comparative study with several contemporary aerial object detectors proved that YOLOv4 performed better, implying a more suitable detection algorithm to incorporate on aerial platforms.
arXiv Detail & Related papers (2022-03-18T23:51:09Z)
AirDet: Few-Shot Detection without Fine-tuning for Autonomous Exploration [16.032316550612336]
We present AirDet, which is free of fine-tuning by learning class relation with support images. AirDet achieves comparable or even better results than the exhaustively finetuned methods, reaching up to 40-60% improvements on the baseline. We present evaluation results on real-world exploration tests from the DARPA Subterranean Challenge.
arXiv Detail & Related papers (2021-12-03T06:41:07Z)
Lite-FPN for Keypoint-based Monocular 3D Object Detection [18.03406686769539]
Keypoint-based monocular 3D object detection has made tremendous progress and achieved great speed-accuracy trade-off. We propose a sort of lightweight feature pyramid network called Lite-FPN to achieve multi-scale feature fusion. Our proposed method achieves significantly higher accuracy and frame rate at the same time.
arXiv Detail & Related papers (2021-05-01T14:44:31Z)
DecAug: Augmenting HOI Detection via Decomposition [54.65572599920679]
Current algorithms suffer from insufficient training samples and category imbalance within datasets. We propose an efficient and effective data augmentation method called DecAug for HOI detection. Experiments show that our method brings up to 3.3 mAP and 1.6 mAP improvements on V-COCO and HICODET dataset.
arXiv Detail & Related papers (2020-10-02T13:59:05Z)
Anchor-free Small-scale Multispectral Pedestrian Detection [88.7497134369344]
We propose a method for effective and efficient multispectral fusion of the two modalities in an adapted single-stage anchor-free base architecture. We aim at learning pedestrian representations based on object center and scale rather than direct bounding box predictions. Results show our method's effectiveness in detecting small-scaled pedestrians.
arXiv Detail & Related papers (2020-08-19T13:13:01Z)
Assembly robots with optimized control stiffness through reinforcement learning [3.4410212782758047]
We propose a methodology that uses reinforcement learning to achieve high performance in robots. The proposed method ensures the online generation of stiffness matrices that help improve the performance of local trajectory optimization. The effectiveness of the method was verified via experiments involving two contact-rich tasks.
arXiv Detail & Related papers (2020-02-27T15:54:43Z)

This list is automatically generated from the titles and abstracts of the papers in this site.