Related papers: LEAP:D - A Novel Prompt-based Approach for Domain-Generalized Aerial Object Detection

URL: http://arxiv.org/abs/2411.09180v1
Date: Thu, 14 Nov 2024 04:39:10 GMT
Title: LEAP:D - A Novel Prompt-based Approach for Domain-Generalized Aerial Object Detection
Authors: Chanyeong Park, Heegwang Kim, Joonki Paik
Abstract summary: We introduce an innovative vision-language approach using learnable prompts. This shift from conventional manual prompts aims to reduce domain-specific knowledge interference. We streamline the training process with a one-step approach, updating the learnable prompt concurrently with model training.
Score: 2.1233286062376497
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Drone-captured images present significant challenges in object detection due to varying shooting conditions, which can alter object appearance and shape. Factors such as drone altitude, angle, and weather cause these variations, influencing the performance of object detection algorithms. To tackle these challenges, we introduce an innovative vision-language approach using learnable prompts. This shift from conventional manual prompts aims to reduce domain-specific knowledge interference, ultimately improving object detection capabilities. Furthermore, we streamline the training process with a one-step approach, updating the learnable prompt concurrently with model training, enhancing efficiency without compromising performance. Our study contributes to domain-generalized object detection by leveraging learnable prompts and optimizing training processes. This enhances model robustness and adaptability across diverse environments, leading to more effective aerial object detection.
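The one-step idea in the abstract, updating a learnable prompt in the same step as the model weights, can be illustrated with a minimal toy sketch. This is not the authors' implementation: the tanh "text encoder", the linear "image encoder" W, the fixed class embeddings, the function name train_jointly, and the toy data are all assumptions made for illustration.

```python
import math
import random


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def train_jointly(data, class_emb, dim_in, steps=200, lr=0.1, seed=0):
    """Toy one-step training: prompt context and model weights share each update.

    Hypothetical setup (not from the paper):
      text feature  t_c = tanh(ctx + class_emb[c])   # ctx is the learnable prompt
      image feature f   = W x                        # W is the learnable model
      logits_c = f . t_c, trained with cross-entropy.
    """
    rng = random.Random(seed)
    dim = len(class_emb[0])
    n_cls = len(class_emb)
    ctx = [0.0] * dim
    W = [[rng.uniform(-0.1, 0.1) for _ in range(dim_in)] for _ in range(dim)]
    losses = []
    for _ in range(steps):
        total = 0.0
        for x, y in data:
            # Forward pass through both branches.
            f = [sum(W[d][i] * x[i] for i in range(dim_in)) for d in range(dim)]
            t = [[math.tanh(ctx[d] + c[d]) for d in range(dim)] for c in class_emb]
            p = softmax([sum(f[d] * tc[d] for d in range(dim)) for tc in t])
            total += -math.log(p[y])
            # Gradient of cross-entropy with respect to the logits.
            g = [p[c] - (1.0 if c == y else 0.0) for c in range(n_cls)]
            grad_f = [sum(g[c] * t[c][d] for c in range(n_cls)) for d in range(dim)]
            grad_ctx = [
                sum(g[c] * f[d] * (1.0 - t[c][d] ** 2) for c in range(n_cls))
                for d in range(dim)
            ]
            # Single joint update: prompt and model move in the same step,
            # rather than learning the prompt first and the model afterwards.
            for d in range(dim):
                ctx[d] -= lr * grad_ctx[d]
                for i in range(dim_in):
                    W[d][i] -= lr * grad_f[d] * x[i]
        losses.append(total / len(data))
    return ctx, W, losses


# Toy data: two inputs, two classes; class embeddings are arbitrary fixed vectors.
data = [([1.0, 0.0], 0), ([0.0, 1.0], 1)]
class_emb = [[1.0, 0.2], [-0.3, 0.8]]
ctx, W, losses = train_jointly(data, class_emb, dim_in=2)
print(losses[0], losses[-1])
```

A two-stage alternative would first fit ctx with W frozen and then train W. The sketch only shows that both parameter groups can receive gradients from one shared loss in a single pass.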

Related papers

Feature Based Methods in Domain Adaptation for Object Detection: A Review Paper [0.6437284704257459]
Domain adaptation aims to enhance the performance of machine learning models when deployed in target domains with distinct data distributions. This review delves into advanced methodologies for domain adaptation, including adversarial learning, discrepancy-based, multi-domain, teacher-student, ensemble, and Vision Language Models. Special attention is given to strategies that minimize the reliance on extensive labeled data, particularly in scenarios involving synthetic-to-real domain shifts.
arXiv Detail & Related papers (2024-12-23T06:34:23Z)
Understanding and Improving Training-Free AI-Generated Image Detections with Vision Foundation Models [68.90917438865078]
Deepfake techniques for facial synthesis and editing pose serious risks for generative models. In this paper, we investigate how detection performance varies across model backbones, types, and datasets. We introduce Contrastive Blur, which enhances performance on facial images, and MINDER, which addresses noise type bias, balancing performance across domains.
arXiv Detail & Related papers (2024-11-28T13:04:45Z)
Underwater Object Detection in the Era of Artificial Intelligence: Current, Challenge, and Future [119.88454942558485]
Underwater object detection (UOD) aims to identify and localise objects in underwater images or videos. In recent years, artificial intelligence (AI) based methods, especially deep learning methods, have shown promising performance in UOD.
arXiv Detail & Related papers (2024-10-08T00:25:33Z)
A Simple Background Augmentation Method for Object Detection with Diffusion Model [53.32935683257045]
In computer vision, it is well-known that a lack of data diversity will impair model performance. We propose a simple yet effective data augmentation approach by leveraging advancements in generative models. Background augmentation, in particular, significantly improves the models' robustness and generalization capabilities.
arXiv Detail & Related papers (2024-08-01T07:40:00Z)
Learning Background Prompts to Discover Implicit Knowledge for Open Vocabulary Object Detection [101.15777242546649]
Open vocabulary object detection (OVD) aims to find an optimal object detector capable of recognizing objects from both base and novel categories. Recent advances leverage knowledge distillation to transfer knowledge from pre-trained large-scale vision-language models to the task of object detection. We present a novel OVD framework, termed LBP, that learns background prompts to harness implicit background knowledge.
arXiv Detail & Related papers (2024-06-01T17:32:26Z)
Active Object Detection with Knowledge Aggregation and Distillation from Large Models [5.669106489320257]
Accurately detecting active objects undergoing state changes is essential for comprehending human interactions and facilitating decision-making. Existing methods for active object detection (AOD) rely primarily on the visual appearance of objects in the input, such as changes in size, shape, and relationship with hands. We observe that state changes are often the result of an interaction performed on the object, and thus propose to use informed priors about plausible object-related interactions to provide more reliable cues for AOD. Our proposed framework achieves state-of-the-art performance on four datasets, namely Ego4D, Epic-Kitchens, MECCANO
arXiv Detail & Related papers (2024-05-21T05:39:31Z)
Zero-shot Degree of Ill-posedness Estimation for Active Small Object Change Detection [8.977792536037956]
In everyday indoor navigation, robots often need to detect non-distinctive small-change objects. Existing techniques rely on high-quality class-specific object priors to regularize a change detector model. In this study, we explore the concept of degree-of-ill-posedness (DoI) to improve both passive and active vision.
arXiv Detail & Related papers (2024-05-10T01:56:39Z)
Deep Active Perception for Object Detection using Navigation Proposals [39.52573252842573]
We propose a generic supervised active perception pipeline for object detection. It can be trained using existing off-the-shelf object detectors, while also leveraging advances in simulation environments. The proposed method was evaluated on synthetic datasets, constructed within the Webots robotics simulator.
arXiv Detail & Related papers (2023-12-15T20:55:52Z)
Lifelong Change Detection: Continuous Domain Adaptation for Small Object Change Detection in Every Robot Navigation [5.8010446129208155]
Ground-view change detection suffers from ill-posedness because of visual uncertainty combined with complex nonlinear perspective projection. To regularize the ill-posedness, commonly applied supervised learning methods rely on manually annotated, high-quality, object-class-specific priors. The present approach adopts the powerful and versatile idea that object changes detected during everyday robot navigation can be reused as additional priors to improve future change detection tasks.
arXiv Detail & Related papers (2023-06-28T10:34:59Z)
Cycle Consistency Driven Object Discovery [75.60399804639403]
We introduce a method that explicitly optimizes the constraint that each object in a scene should be associated with a distinct slot. By integrating these consistency objectives into various existing slot-based object-centric methods, we showcase substantial improvements in object-discovery performance. Our results suggest that the proposed approach not only improves object discovery, but also provides richer features for downstream tasks.
arXiv Detail & Related papers (2023-06-03T21:49:06Z)
Adversarially-Aware Robust Object Detector [85.10894272034135]
We propose a Robust Detector (RobustDet) based on adversarially-aware convolution to disentangle gradients for model learning on clean and adversarial images. Our model effectively disentangles gradients and significantly enhances detection robustness while maintaining detection ability on clean images.
arXiv Detail & Related papers (2022-07-13T13:59:59Z)
Object Detection and Recognition of Swap-Bodies using Camera mounted on a Vehicle [13.702911401489427]
This project aims to jointly detect a swap-body and identify its type by reading an ILU code. Recent research has drastically improved deep learning techniques, which in turn has advanced the field of computer vision.
arXiv Detail & Related papers (2020-04-17T08:49:54Z)
Incremental Object Detection via Meta-Learning [77.55310507917012]
We propose a meta-learning approach that learns to reshape model gradients, such that information across incremental tasks is optimally shared. In comparison to existing meta-learning methods, our approach is task-agnostic, allows incremental addition of new-classes and scales to high-capacity models for object detection.
arXiv Detail & Related papers (2020-03-17T13:40:00Z)

This list is automatically generated from the titles and abstracts of the papers on this site.