SimROD: A Simple Baseline for Raw Object Detection with Global and Local Enhancements
- URL: http://arxiv.org/abs/2503.07101v1
- Date: Mon, 10 Mar 2025 09:23:14 GMT
- Title: SimROD: A Simple Baseline for Raw Object Detection with Global and Local Enhancements
- Authors: Haiyang Xie, Xi Shen, Shihua Huang, Zheng Wang
- Abstract summary: We propose SimROD, a lightweight and effective approach for RAW object detection. We introduce a Global Gamma Enhancement (GGE) module, which applies a learnable global gamma transformation with only four parameters. Our work highlights the potential of RAW data for real-world object detection.
- Score: 7.08243476424994
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most visual models are designed for sRGB images, yet RAW data offers significant advantages for object detection by preserving sensor information before ISP processing. This enables improved detection accuracy and more efficient hardware designs by bypassing the ISP. However, RAW object detection is challenging due to limited training data, unbalanced pixel distributions, and sensor noise. To address this, we propose SimROD, a lightweight and effective approach for RAW object detection. We introduce a Global Gamma Enhancement (GGE) module, which applies a learnable global gamma transformation with only four parameters, improving feature representation while keeping the model efficient. Additionally, we leverage the green channel's richer signal to enhance local details, aligning with the human eye's sensitivity and Bayer filter design. Extensive experiments on multiple RAW object detection datasets and detectors demonstrate that SimROD outperforms state-of-the-art methods like RAW-Adapter and DIAP while maintaining efficiency. Our work highlights the potential of RAW data for real-world object detection.
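The two ideas in the abstract, a global gamma transformation with four parameters and the use of the green channel for local detail, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not say how the four parameters are used, so the sketch assumes one gamma exponent per packed Bayer plane (R, G1, G2, B) and an RGGB layout. The function names `pack_bayer_rggb`, `global_gamma`, and `green_guidance` are hypothetical, and the exponents are fixed constants here, whereas in a trained detector they would be learned.

```python
import numpy as np


def pack_bayer_rggb(raw):
    """Pack an (H, W) Bayer mosaic into (H/2, W/2, 4) planes: R, G1, G2, B.

    Assumes an RGGB layout; other sensors use different orderings.
    """
    return np.stack(
        [raw[0::2, 0::2], raw[0::2, 1::2], raw[1::2, 0::2], raw[1::2, 1::2]],
        axis=-1,
    )


def global_gamma(packed, gammas, eps=1e-6):
    """Apply one gamma exponent per plane to data normalized to [0, 1].

    Assumption: the "four parameters" are one exponent per packed plane.
    """
    gammas = np.asarray(gammas, dtype=np.float64)
    # Clip away exact zeros so the power stays well behaved for any exponent.
    return np.power(np.clip(packed, eps, 1.0), gammas)


def green_guidance(packed):
    """Average the two green planes as a simple local-detail signal."""
    return 0.5 * (packed[..., 1] + packed[..., 2])


# Synthetic 8x8 mosaic with values in [0, 1) standing in for a RAW frame.
raw = np.random.default_rng(0).random((8, 8))
packed = pack_bayer_rggb(raw)
enhanced = global_gamma(packed, gammas=[0.5, 0.5, 0.5, 0.5])
green = green_guidance(enhanced)
```

An exponent below 1 brightens dark pixels, which is one way to counter the unbalanced pixel distribution the abstract mentions. Averaging the two green planes reflects the fact that a Bayer mosaic samples green twice as densely as red or blue.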
Related papers
- Beyond RGB: Adaptive Parallel Processing for RAW Object Detection [5.36869872375791]
The Raw Adaptation Module (RAM) is designed to replace the traditional Image Signal Processing (ISP) pipeline.
Our approach outperforms RGB-based methods and achieves state-of-the-art results across diverse RAW image datasets.
arXiv Detail & Related papers (2025-03-17T13:36:49Z)
- Towards RAW Object Detection in Diverse Conditions [65.30190654593842]
We introduce the AODRaw dataset, which offers 7,785 high-resolution real RAW images with 135,601 annotated instances spanning 62 categories.
We find that sRGB pre-training constrains the potential of RAW object detection due to the domain gap between sRGB and RAW.
We distill the knowledge from an off-the-shelf model pre-trained on the sRGB domain to assist RAW pre-training.
arXiv Detail & Related papers (2024-11-24T01:23:04Z)
- RAW-Diffusion: RGB-Guided Diffusion Models for High-Fidelity RAW Image Generation [4.625376287612609]
We propose a novel diffusion-based method for generating RAW images guided by RGB images.
This approach yields high-fidelity RAW images, enabling the creation of camera-specific RAW datasets.
We extend our method to create BDD100K-RAW and Cityscapes-RAW datasets, revealing its effectiveness for object detection in RAW imagery.
arXiv Detail & Related papers (2024-11-20T09:40:12Z)
- Unveiling Hidden Details: A RAW Data-Enhanced Paradigm for Real-World Super-Resolution [56.98910228239627]
Real-world image super-resolution (Real SR) aims to generate high-fidelity, detail-rich high-resolution (HR) images from low-resolution (LR) counterparts.
Existing Real SR methods primarily focus on generating details from the LR RGB domain, often leading to a lack of richness or fidelity in fine details.
We pioneer the use of details hidden in RAW data to complement existing RGB-only methods, yielding superior outputs.
arXiv Detail & Related papers (2024-11-16T13:29:50Z)
- Enhanced Automotive Object Detection via RGB-D Fusion in a DiffusionDet Framework [0.0]
Vision-based autonomous driving requires reliable and efficient object detection.
This work proposes a DiffusionDet-based framework that fuses data from a monocular camera and a depth sensor to provide RGB and depth (RGB-D) data.
By integrating the textural and color features from RGB images with the spatial depth information from the LiDAR sensors, the proposed framework employs a feature fusion that substantially enhances object detection of automotive targets.
arXiv Detail & Related papers (2024-06-05T10:24:00Z)
- SIRST-5K: Exploring Massive Negatives Synthesis with Self-supervised Learning for Robust Infrared Small Target Detection [53.19618419772467]
Single-frame infrared small target (SIRST) detection aims to recognize small targets against cluttered backgrounds.
With the development of Transformers, the scale of SIRST models is constantly increasing.
With a rich diversity of infrared small target data, our algorithm significantly improves the model performance and convergence speed.
arXiv Detail & Related papers (2024-03-08T16:14:54Z)
- BSRAW: Improving Blind RAW Image Super-Resolution [63.408484584265985]
We tackle blind image super-resolution in the RAW domain.
We design a realistic degradation pipeline tailored specifically for training models with raw sensor data.
Our BSRAW models trained with our pipeline can upscale real-scene RAW images and improve their quality.
arXiv Detail & Related papers (2023-12-24T14:17:28Z)
- Innovative Horizons in Aerial Imagery: LSKNet Meets DiffusionDet for Advanced Object Detection [55.2480439325792]
We present an in-depth evaluation of an object detection model that integrates the LSKNet backbone with the DiffusionDet head.
The proposed model achieves a mean average precision (mAP) of approximately 45.7%, which is a significant improvement.
This advancement underscores the effectiveness of the proposed modifications and sets a new benchmark in aerial image analysis.
arXiv Detail & Related papers (2023-11-21T19:49:13Z)
- Reversed Image Signal Processing and RAW Reconstruction. AIM 2022 Challenge Report [109.2135194765743]
This paper introduces the AIM 2022 Challenge on Reversed Image Signal Processing and RAW Reconstruction.
We aim to recover raw sensor images from the corresponding RGBs without metadata and, by doing this, "reverse" the ISP transformation.
arXiv Detail & Related papers (2022-10-20T10:43:53Z)
- GenISP: Neural ISP for Low-Light Machine Cognition [19.444297600977546]
In low-light conditions, object detectors using raw image data are more robust than detectors using image data processed by an ISP pipeline.
We propose a minimal neural ISP pipeline for machine cognition, named GenISP, that explicitly incorporates Color Space Transformation to a device-independent color space.
arXiv Detail & Related papers (2022-05-07T17:17:24Z)