A biologically inspired separable learning vision model for real-time traffic object perception in Dark
- URL: http://arxiv.org/abs/2509.05012v1
- Date: Fri, 05 Sep 2025 11:22:52 GMT
- Title: A biologically inspired separable learning vision model for real-time traffic object perception in Dark
- Authors: Hulin Li, Qiliang Ren, Jun Li, Hanbing Wei, Zheng Liu, Linfang Fan
- Abstract summary: We introduce a physically grounded illumination degradation method tailored to real-world low-light settings and construct Dark-traffic, the largest densely annotated dataset to date for low-light traffic scenes. We also propose the Separable Learning Vision Model (SLVM), a biologically inspired framework designed to enhance perception under adverse lighting. SLVM integrates four key components: a light-adaptive pupillary mechanism for illumination-sensitive feature extraction, a feature-level separable learning strategy for efficient representation, task-specific decoupled branches for multi-task separable learning, and a spatial misalignment-aware fusion module for precise multi-feature alignment.
- Score: 8.798037910488812
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Fast and accurate object perception in low-light traffic scenes has attracted increasing attention. However, due to severe illumination degradation and the lack of reliable visual cues, existing perception models and methods struggle to adapt quickly, and predict accurately, in low-light environments. Moreover, no large-scale benchmark specifically focused on low-light traffic scenes has been available. To bridge this gap, we introduce a physically grounded illumination degradation method tailored to real-world low-light settings and construct Dark-traffic, the largest densely annotated dataset to date for low-light traffic scenes, supporting object detection, instance segmentation, and optical flow estimation. We further propose the Separable Learning Vision Model (SLVM), a biologically inspired framework designed to enhance perception under adverse lighting. SLVM integrates four key components: a light-adaptive pupillary mechanism for illumination-sensitive feature extraction, a feature-level separable learning strategy for efficient representation, task-specific decoupled branches for multi-task separable learning, and a spatial misalignment-aware fusion module for precise multi-feature alignment. Extensive experiments demonstrate that SLVM achieves state-of-the-art performance with reduced computational overhead. Notably, it outperforms RT-DETR by 11.2 percentage points in detection, YOLOv12 by 6.1 percentage points in instance segmentation, and reduces the baseline's endpoint error (EPE) by 12.37% on Dark-traffic. On the LIS benchmark, the end-to-end trained SLVM surpasses Swin Transformer+EnlightenGAN and ConvNeXt-T+EnlightenGAN by an average of 11 percentage points across key metrics, and exceeds Mask R-CNN (with light enhancement) by 3.1 percentage points. The Dark-traffic dataset and complete code are released at https://github.com/alanli1997/slvm.
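The abstract gives no implementation details for the illumination degradation method, but physically grounded low-light synthesis is commonly modeled as an exposure drop in (approximately) linear radiance followed by signal-dependent sensor noise. The sketch below illustrates that general recipe only; the function name, parameter ranges, and noise model are assumptions, not the authors' released pipeline (see the linked repository for that).

```python
import numpy as np

def degrade_low_light(img, exposure=0.15, gamma=2.2,
                      shot_scale=200.0, read_noise=0.01, rng=None):
    """Hypothetical low-light degradation for an sRGB image in [0, 1], (H, W, 3).

    Dims the scene in approximately linear radiance, then adds
    signal-dependent shot noise and signal-independent read noise.
    """
    rng = np.random.default_rng() if rng is None else rng
    linear = img ** gamma                                  # undo display gamma
    dimmed = linear * exposure                             # global illumination drop
    shot = rng.poisson(dimmed * shot_scale) / shot_scale   # Poisson shot noise
    noisy = shot + rng.normal(0.0, read_noise, img.shape)  # Gaussian read noise
    return np.clip(noisy, 0.0, 1.0) ** (1.0 / gamma)       # back to display gamma

# Usage: dark = degrade_low_light(np.float32(rgb) / 255.0)
```

One practical appeal of synthetic degradation is that the dense annotations of the well-lit source images carry over to their darkened copies unchanged.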
Related papers
- SAIGFormer: A Spatially-Adaptive Illumination-Guided Network for Low-Light Image Enhancement [58.79901582809091]
Recent Transformer-based low-light enhancement methods have made promising progress in recovering global illumination. We present a Spatially-Adaptive Illumination-Guided Transformer framework that enables accurate illumination restoration.
arXiv Detail & Related papers (2025-07-21T11:38:56Z)
- OwlSight: A Robust Illumination Adaptation Framework for Dark Video Human Action Recognition [19.035892288559975]
We propose OwlSight, a biomimetic-inspired framework with whole-stage illumination enhancement that interacts with action classification for accurate human action recognition in dark videos. We also build Dark-101, a large-scale dataset comprising 18,310 dark videos across 101 action categories, significantly surpassing existing datasets in scale and diversity. Notably, OwlSight outperforms the previous best approaches by 5.36% on ARID1.5 and 1.72% on Dark-101, highlighting its effectiveness in challenging dark environments.
arXiv Detail & Related papers (2025-03-30T00:54:22Z)
- BRIGHT-VO: Brightness-Guided Hybrid Transformer for Visual Odometry with Multi-modality Refinement Module [11.898515581215708]
Visual odometry (VO) plays a crucial role in autonomous driving, robotic navigation, and other related tasks. We introduce BrightVO, a novel Transformer-based VO model that performs front-end visual feature extraction. Using pose graph optimization, a refinement module iteratively refines pose estimates to reduce errors and improve both accuracy and robustness.
arXiv Detail & Related papers (2025-01-15T08:50:52Z)
- Point Cloud Understanding via Attention-Driven Contrastive Learning [64.65145700121442]
Transformer-based models have advanced point cloud understanding by leveraging self-attention mechanisms.
PointACL is an attention-driven contrastive learning framework designed to address the limitations of such models.
Our method employs an attention-driven dynamic masking strategy that guides the model to focus on under-attended regions.
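The masking strategy is only named here; a minimal sketch of one way to realize it, assuming "under-attended" means the tokens that receive the least mean attention and that a fixed masking ratio is used (both assumptions, not details from the paper):

```python
import torch

def underattended_mask(attn, mask_ratio=0.3):
    """attn: (B, heads, N, N) attention weights from a transformer block.

    Returns a boolean mask of shape (B, N) that is True for the tokens
    receiving the least average attention, i.e. candidates to mask so the
    model is pushed to represent under-attended regions.
    """
    received = attn.mean(dim=1).mean(dim=1)   # (B, N): attention each token receives
    k = int(attn.shape[-1] * mask_ratio)
    idx = received.topk(k, dim=-1, largest=False).indices
    mask = torch.zeros_like(received, dtype=torch.bool)
    return mask.scatter(1, idx, True)
```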
arXiv Detail & Related papers (2024-11-22T05:41:00Z)
- DAP-LED: Learning Degradation-Aware Priors with CLIP for Joint Low-light Enhancement and Deblurring [14.003870853594972]
We propose a novel transformer-based joint learning framework, named DAP-LED.
It can jointly achieve low-light enhancement and deblurring, benefiting downstream tasks, such as depth estimation, segmentation, and detection in the dark.
The key insight is to leverage CLIP to adaptively learn the degradation levels from images at night.
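A zero-shot approximation of that insight can be sketched with the public CLIP package by scoring an image against hand-written degradation prompts; the prompts, severity weights, and file name below are illustrative assumptions rather than DAP-LED's learned prior.

```python
import torch
import clip  # OpenAI's CLIP package
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Hand-written prompts spanning degradation severity (illustrative only).
prompts = ["a sharp, well-lit photo", "a dark photo", "a dark and blurry photo"]
text = clip.tokenize(prompts).to(device)
image = preprocess(Image.open("night_scene.jpg")).unsqueeze(0).to(device)

with torch.no_grad():
    logits_per_image, _ = model(image, text)   # image-to-prompt similarities
    probs = logits_per_image.softmax(dim=-1)   # (1, len(prompts))

# Collapse to a crude scalar degradation level in [0, 1].
severity = torch.tensor([0.0, 0.5, 1.0], device=device)
level = (probs[0] * severity).sum().item()
print(f"estimated degradation level: {level:.2f}")
```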
arXiv Detail & Related papers (2024-09-20T13:37:53Z)
- EvLight++: Low-Light Video Enhancement with an Event Camera: A Large-Scale Real-World Dataset, Novel Method, and More [7.974102031202597]
EvLight++ is a novel event-guided low-light video enhancement approach designed for robust performance in real-world scenarios.
EvLight++ significantly outperforms both single image- and video-based methods by 1.37 dB and 3.71 dB, respectively.
arXiv Detail & Related papers (2024-08-29T04:30:31Z)
- A Lightweight Low-Light Image Enhancement Network via Channel Prior and Gamma Correction [0.0]
Low-light image enhancement (LLIE) refers to image enhancement technology tailored to handle low-light scenes.
We introduce CPGA-Net, an innovative LLIE network that combines dark/bright channel priors and gamma correction via deep learning.
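Both ingredients named here are classical and easy to state in code: the dark (bright) channel is a local minimum (maximum) over color channels and a spatial patch, and gamma correction is a pointwise power curve. A minimal sketch follows; the patch size and gamma value are illustrative fixed constants, whereas CPGA-Net learns how to combine these cues.

```python
import numpy as np
from scipy.ndimage import maximum_filter, minimum_filter

def channel_priors(img, patch=15):
    """img: float RGB in [0, 1], shape (H, W, 3).

    Dark channel: local minimum over color channels and a spatial patch;
    bright channel: the corresponding local maximum.
    """
    dark = minimum_filter(img.min(axis=2), size=patch)
    bright = maximum_filter(img.max(axis=2), size=patch)
    return dark, bright

def gamma_correct(img, gamma=0.45):
    """gamma < 1 brightens shadows; a fixed value stands in for a learned curve."""
    return np.clip(img, 0.0, 1.0) ** gamma
```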
arXiv Detail & Related papers (2024-02-28T08:18:20Z)
- Low-Light Hyperspectral Image Enhancement [90.84144276935464]
This work focuses on the low-light HSI enhancement task, which aims to reveal the spatial-spectral information hidden in darkened areas.
Based on Laplacian pyramid decomposition and reconstruction, we developed an end-to-end data-driven low-light HSI enhancement (HSIE) approach.
The effectiveness and efficiency of HSIE are demonstrated both in quantitative assessment measures and in visual effects.
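Laplacian pyramid decomposition and reconstruction are standard operations; the sketch below shows them for a single 2-D band, on the assumption that the hyperspectral cube is processed band by band (how HSIE actually applies the pyramid is not specified in this summary).

```python
import cv2

def laplacian_pyramid(band, levels=3):
    """Decompose one float32 2-D band into high-frequency levels plus a residual."""
    pyr, cur = [], band
    for _ in range(levels):
        down = cv2.pyrDown(cur)
        up = cv2.pyrUp(down, dstsize=(cur.shape[1], cur.shape[0]))
        pyr.append(cur - up)   # detail lost at this scale
        cur = down
    pyr.append(cur)            # low-frequency residual
    return pyr

def reconstruct(pyr):
    """Invert the decomposition exactly by upsampling and adding details back."""
    cur = pyr[-1]
    for lap in reversed(pyr[:-1]):
        cur = cv2.pyrUp(cur, dstsize=(lap.shape[1], lap.shape[0])) + lap
    return cur
```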
arXiv Detail & Related papers (2022-08-05T08:45:52Z)
- Multitask AET with Orthogonal Tangent Regularity for Dark Object Detection [84.52197307286681]
We propose a novel multitask auto-encoding transformation (MAET) model to enhance object detection in dark environments.
In a self-supervision manner, the MAET learns the intrinsic visual structure by encoding and decoding the realistic illumination-degrading transformation.
We achieve state-of-the-art performance on synthetic and real-world datasets.
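A toy version of MAET-style self-supervision can make the idea concrete: degrade an image with a parameterized illumination transformation, then train the network to decode those parameters. The two-parameter transform and tiny encoder below are illustrative assumptions; the actual MAET models a fuller camera pipeline.

```python
import torch
import torch.nn as nn

def random_degrade(img):
    """Darken a batch with a random exposure/gamma transform; return the params."""
    b = img.shape[0]
    exposure = torch.empty(b, 1, 1, 1).uniform_(0.05, 0.5)
    gamma = torch.empty(b, 1, 1, 1).uniform_(1.5, 3.0)
    dark = (img ** gamma) * exposure
    return dark, torch.cat([exposure, gamma], dim=1).flatten(1)  # (B, 2) targets

# The encoder must decode the degrading parameters from the dark image,
# which forces it to represent illumination explicitly.
encoder = nn.Sequential(nn.Conv2d(3, 16, 3, 2, 1), nn.ReLU(),
                        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 2))
img = torch.rand(4, 3, 64, 64)
dark, params = random_degrade(img)
loss = nn.functional.mse_loss(encoder(dark), params)
loss.backward()
```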
arXiv Detail & Related papers (2022-05-06T16:27:14Z)
- LightDefectNet: A Highly Compact Deep Anti-Aliased Attention Condenser Neural Network Architecture for Light Guide Plate Surface Defect Detection [71.40595908386477]
An essential step in the manufacturing of light guide plates is the quality inspection of defects such as scratches, bright/dark spots, and impurities.
Advances in deep learning-driven computer vision have led to the exploration of automated visual quality inspection of light guide plates.
LightDefectNet is a highly compact deep anti-aliased attention condenser neural network architecture tailored specifically for light guide plate surface defect detection.
arXiv Detail & Related papers (2022-04-25T16:33:37Z)
- Sparse Needlets for Lighting Estimation with Spherical Transport Loss [89.52531416604774]
NeedleLight is a new lighting estimation model that represents illumination with needlets and enables lighting estimation jointly in the frequency and spatial domains.
Extensive experiments show that NeedleLight achieves superior lighting estimation consistently across multiple evaluation metrics as compared with state-of-the-art methods.
arXiv Detail & Related papers (2021-06-24T15:19:42Z)
- Object-based Illumination Estimation with Rendering-aware Neural Networks [56.01734918693844]
We present a scheme for fast environment light estimation from the RGBD appearance of individual objects and their local image areas.
With the estimated lighting, virtual objects can be rendered in AR scenarios with shading that is consistent to the real scene.
arXiv Detail & Related papers (2020-08-06T08:23:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.