Unsupervised Ultra-High-Resolution UAV Low-Light Image Enhancement: A Benchmark, Metric and Framework
- URL: http://arxiv.org/abs/2509.01373v1
- Date: Mon, 01 Sep 2025 11:20:07 GMT
- Title: Unsupervised Ultra-High-Resolution UAV Low-Light Image Enhancement: A Benchmark, Metric and Framework
- Authors: Wei Lu, Lingyu Zhu, Si-Bao Chen
- Abstract summary: Low-light conditions significantly degrade the performance of Unmanned Aerial Vehicles (UAVs) in critical applications. Existing Low-light Image Enhancement (LIE) methods struggle with the unique challenges of aerial imagery. We present U3D, the first unsupervised UHR UAV dataset for LIE, with a unified evaluation toolkit. Second, we introduce the Edge Efficiency Index (EEI), a novel metric balancing perceptual quality with key deployment factors.
- Score: 9.515570339229962
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Low-light conditions significantly degrade the performance of Unmanned Aerial Vehicles (UAVs) in critical applications. Existing Low-light Image Enhancement (LIE) methods struggle with the unique challenges of aerial imagery, including Ultra-High Resolution (UHR), lack of paired data, severe non-uniform illumination, and deployment constraints. To address these issues, we propose three key contributions. First, we present U3D, the first unsupervised UHR UAV dataset for LIE, with a unified evaluation toolkit. Second, we introduce the Edge Efficiency Index (EEI), a novel metric balancing perceptual quality with key deployment factors: speed, resolution, model complexity, and memory footprint. Third, we develop U3LIE, an efficient framework with two training-only designs: Adaptive Pre-enhancement Augmentation (APA) for input normalization and a Luminance Interval Loss (L_int) for exposure control. U3LIE achieves SOTA results, processing 4K images at 23.8 FPS on a single GPU, making it ideal for real-time on-board deployment. In summary, these contributions provide a holistic solution (dataset, metric, and method) for advancing robust 24/7 UAV vision. The code and datasets are available at https://github.com/lwCVer/U3D_Toolkit.
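The abstract does not give the formula for the Edge Efficiency Index, only the factors it balances (perceptual quality, speed, model complexity, memory footprint). A minimal sketch of one plausible composite score along those lines; the function name, weights, and log-damped normalization are all illustrative assumptions, not the paper's definition:

```python
import math

def edge_efficiency_index(quality, fps, params_m, mem_gb,
                          w_q=1.0, w_s=0.5, w_p=0.25, w_m=0.25):
    """Hypothetical composite deployment score (NOT the paper's EEI).

    quality:  perceptual quality score (higher is better)
    fps:      inference speed in frames per second
    params_m: model size in millions of parameters
    mem_gb:   peak memory footprint in gigabytes

    Higher quality and speed raise the index; larger models and bigger
    memory footprints lower it. log1p damps very large raw values so no
    single factor dominates the score.
    """
    return (w_q * quality
            + w_s * math.log1p(fps)
            - w_p * math.log1p(params_m)
            - w_m * math.log1p(mem_gb))
```

Under such a scheme, a model that doubles its FPS at equal quality scores strictly higher, while one that doubles its parameter count at equal quality scores strictly lower, which matches the trade-off the abstract describes.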
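Similarly, the Luminance Interval Loss (L_int) for exposure control is only named, not defined. One natural reading is a penalty that is zero while mean luminance stays inside a target interval and grows as it drifts outside; the sketch below follows that reading, with the function name and the [0.4, 0.6] interval chosen as assumptions rather than taken from the paper:

```python
def luminance_interval_loss(pixels, low=0.4, high=0.6):
    """Hypothetical interval-style exposure penalty (NOT the paper's L_int).

    pixels: iterable of luminance values normalized to [0, 1].
    Returns 0 when the mean luminance already lies inside [low, high];
    otherwise the squared distance from the mean to the nearest
    interval boundary, so both under- and over-exposure are penalized.
    """
    vals = list(pixels)
    mean_lum = sum(vals) / len(vals)
    if mean_lum < low:
        return (low - mean_lum) ** 2
    if mean_lum > high:
        return (mean_lum - high) ** 2
    return 0.0
```

An interval (rather than a single target value) leaves the enhancer free to pick any exposure within an acceptable band, which suits aerial scenes with severe non-uniform illumination.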
Related papers
- A Tri-Modal Dataset and a Baseline System for Tracking Unmanned Aerial Vehicles [74.8162337823142]
MM-UAV is the first large-scale benchmark for Multi-Modal UAV Tracking. The dataset spans over 30 challenging scenarios, with 1,321 synchronised multi-modal sequences and more than 2.8 million annotated frames. Accompanying the dataset, we provide a novel multi-modal multi-UAV tracking framework.
arXiv Detail & Related papers (2025-11-23T08:42:17Z)
- Light-Weight Cross-Modal Enhancement Method with Benchmark Construction for UAV-based Open-Vocabulary Object Detection [6.443926939309045]
We propose a complete UAV-oriented solution that combines both dataset construction and model innovation. First, we design a refined UAV-Label Engine, which efficiently resolves annotation redundancy, inconsistency, and ambiguity. Second, we introduce the Cross-Attention Gated Enhancement (CAGE) module, a lightweight dual-path fusion design that integrates cross-attention, adaptive gating, and global FiLM modulation for robust text-vision alignment.
arXiv Detail & Related papers (2025-09-07T10:59:02Z)
- Towards Lightest Low-Light Image Enhancement Architecture for Mobile Devices [3.7651572719063178]
Real-time low-light image enhancement on mobile and embedded devices requires models that balance visual quality and computational efficiency. We propose LiteIE, an ultra-lightweight unsupervised enhancement framework that eliminates dependence on large-scale supervision. LiteIE runs at 30 FPS for 4K images with just 58 parameters, enabling real-time deployment on edge devices.
arXiv Detail & Related papers (2025-07-06T07:36:47Z)
- One RL to See Them All: Visual Triple Unified Reinforcement Learning [92.90120580989839]
We propose V-Triune, a Visual Triple Unified Reinforcement Learning system that enables visual reasoning and perception tasks within a single training pipeline. V-Triune comprises complementary components, including a Sample-Level Datashelf (to unify diverse task inputs) and a Verifier-Level Reward (to deliver custom rewards via specialized verifiers). We introduce a novel Dynamic IoU reward, which provides adaptive, progressive, and definite feedback for perception tasks handled by V-Triune.
arXiv Detail & Related papers (2025-05-23T17:41:14Z)
- Multi-Knowledge-oriented Nighttime Haze Imaging Enhancer for Vision-driven Intelligent Systems [4.742689734374541]
Adverse imaging conditions such as haze severely degrade image quality. We propose a multi-knowledge-oriented nighttime haze imaging enhancer (MKoIE). MKoIE integrates three tasks: daytime dehazing, low-light enhancement, and nighttime dehazing.
arXiv Detail & Related papers (2025-02-11T08:22:21Z)
- RemDet: Rethinking Efficient Model Design for UAV Object Detection [12.652666443395528]
Object detection in Unmanned Aerial Vehicle (UAV) images has emerged as a focal area of research. Current real-time object detectors are not optimized for UAV images. We propose a novel detector, RemDet, to address these challenges.
arXiv Detail & Related papers (2024-12-13T11:00:57Z)
- Enhancing Nighttime UAV Tracking with Light Distribution Suppression [6.950880335490385]
This work proposes LDEnhancer, a novel enhancer that improves nighttime UAV tracking through light distribution suppression.
Specifically, a novel image content refinement module is developed to decompose the light distribution information and image content information.
A challenging nighttime UAV tracking dataset with uneven light distribution, namely NAT2024-2, is constructed to provide a comprehensive evaluation.
arXiv Detail & Related papers (2024-09-25T05:19:35Z)
- 4D Contrastive Superflows are Dense 3D Representation Learners [62.433137130087445]
We introduce SuperFlow, a novel framework designed to harness consecutive LiDAR-camera pairs for establishing pretraining objectives.
To further boost learning efficiency, we incorporate a plug-and-play view consistency module that enhances the alignment of the knowledge distilled from camera views.
arXiv Detail & Related papers (2024-07-08T17:59:54Z)
- Ultra-High-Definition Low-Light Image Enhancement: A Benchmark and Transformer-Based Method [51.30748775681917]
We consider the task of low-light image enhancement (LLIE) and introduce a large-scale database consisting of images at 4K and 8K resolution.
We conduct systematic benchmarking studies and provide a comparison of current LLIE algorithms.
As a second contribution, we introduce LLFormer, a transformer-based low-light enhancement method.
arXiv Detail & Related papers (2022-12-22T09:05:07Z)
- Benchmarking the Robustness of LiDAR-Camera Fusion for 3D Object Detection [58.81316192862618]
Two critical sensors for 3D perception in autonomous driving are the camera and the LiDAR.
Fusing these two modalities can significantly boost the performance of 3D perception models.
We benchmark the state-of-the-art fusion methods for the first time.
arXiv Detail & Related papers (2022-05-30T09:35:37Z)
- Activation to Saliency: Forming High-Quality Labels for Unsupervised Salient Object Detection [54.92703325989853]
We propose a two-stage Activation-to-Saliency (A2S) framework that effectively generates high-quality saliency cues.
No human annotations are involved in our framework during the whole training process.
Our framework achieves significant performance gains over existing USOD methods.
arXiv Detail & Related papers (2021-12-07T11:54:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.