Bilevel Generative Learning for Low-Light Vision
- URL: http://arxiv.org/abs/2308.03381v1
- Date: Mon, 7 Aug 2023 07:59:56 GMT
- Title: Bilevel Generative Learning for Low-Light Vision
- Authors: Yingchi Liu, Zhu Liu, Long Ma, Jinyuan Liu, Xin Fan, Zhongxuan Luo,
Risheng Liu
- Abstract summary: We propose a generic low-light vision solution by introducing a generative block to convert data from the RAW to the RGB domain.
This novel approach connects diverse vision problems by explicitly depicting data generation, which is the first in the field.
We develop two types of learning strategies targeting different goals, namely low cost and high accuracy, to acquire a new bilevel generative learning paradigm.
- Score: 64.77933848939327
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, there has been a growing interest in constructing deep learning
schemes for Low-Light Vision (LLV). Existing techniques primarily focus on
designing task-specific and data-dependent vision models on the standard RGB
domain, which inherently contain latent data associations. In this study, we
propose a generic low-light vision solution by introducing a generative block
to convert data from the RAW to the RGB domain. This novel approach connects
diverse vision problems by explicitly depicting data generation, which is the
first in the field. To precisely characterize the latent correspondence between
the generative procedure and the vision task, we establish a bilevel model with
the parameters of the generative block defined as the upper level and the
parameters of the vision task defined as the lower level. We further develop
two types of learning strategies targeting different goals, namely low cost and
high accuracy, to acquire a new bilevel generative learning paradigm. The
generative blocks embrace a strong generalization ability in other low-light
vision tasks through the bilevel optimization on enhancement tasks. Extensive
experimental evaluations on three representative low-light vision tasks, namely
enhancement, detection, and segmentation, fully demonstrate the superiority of
our proposed approach. The code will be available at
https://github.com/Yingchi1998/BGL.
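The bilevel model described in the abstract can be sketched in the generic form used for bilevel optimization. This is an illustrative formulation only: the symbols are assumptions and do not come from the listing. Here $\boldsymbol{\omega}$ stands for the parameters of the generative block (upper level), $\boldsymbol{\theta}$ for the parameters of the vision task (lower level), and $F$ and $f$ are placeholder upper- and lower-level objectives. The paper's own notation and losses may differ.

```latex
% Generic bilevel program (illustrative; all symbols are assumed, not from the source).
% Upper level: generative-block parameters \omega.
% Lower level: vision-task parameters \theta, optimized for a given \omega.
\begin{aligned}
\min_{\boldsymbol{\omega}} \quad
  & F\bigl(\boldsymbol{\omega},\, \boldsymbol{\theta}^{*}(\boldsymbol{\omega})\bigr) \\
\text{s.t.} \quad
  & \boldsymbol{\theta}^{*}(\boldsymbol{\omega}) \in
    \arg\min_{\boldsymbol{\theta}} f(\boldsymbol{\theta};\, \boldsymbol{\omega})
\end{aligned}
```

Read this way, the upper-level choice of $\boldsymbol{\omega}$ is evaluated only through the lower-level solution $\boldsymbol{\theta}^{*}(\boldsymbol{\omega})$ it induces, which is how a generic bilevel program couples the two parameter sets.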
Related papers
- Instruction-Guided Fusion of Multi-Layer Visual Features in Large Vision-Language Models [50.98559225639266]
We investigate the contributions of visual features from different encoder layers using 18 benchmarks spanning 6 task categories.
Our findings reveal that multilayer features provide complementary strengths with varying task dependencies, and uniform fusion leads to suboptimal performance.
We propose the instruction-guided vision aggregator, a module that dynamically integrates multi-layer visual features based on textual instructions.
arXiv Detail & Related papers (2024-12-26T05:41:31Z)
- World-Consistent Data Generation for Vision-and-Language Navigation [52.08816337783936]
Vision-and-Language Navigation (VLN) is a challenging task that requires an agent to navigate through photorealistic environments following natural-language instructions.
A main obstacle in VLN is data scarcity, which leads to poor generalization to unseen environments.
We propose the world-consistent data generation (WCGEN), an efficacious data-augmentation framework satisfying both diversity and world-consistency.
arXiv Detail & Related papers (2024-12-09T11:40:54Z)
- LaVin-DiT: Large Vision Diffusion Transformer [99.98106406059333]
LaVin-DiT is a scalable and unified foundation model designed to tackle over 20 computer vision tasks in a generative framework.
We introduce key innovations to optimize generative performance for vision tasks.
The model is scaled from 0.1B to 3.4B parameters, demonstrating substantial scalability and state-of-the-art performance across diverse vision tasks.
arXiv Detail & Related papers (2024-11-18T12:05:27Z)
- Unsupervised Variational Translator for Bridging Image Restoration and High-Level Vision Tasks [24.076965636237098]
We propose an unsupervised learning method called Variational Translator (VaT), which does not require retraining existing restoration and high-level vision networks.
VaT achieves the above optimization objective without requiring labels.
Experiments in dehazing and low-light enhancement for detection and classification show the superiority of our method over other state-of-the-art unsupervised counterparts.
arXiv Detail & Related papers (2024-08-15T13:35:59Z)
- Generative-Enhanced Heterogeneous Graph Contrastive Learning [11.118517297006894]
Heterogeneous Graphs (HGs) can effectively model complex real-world relationships through multi-type nodes and edges.
In recent years, inspired by self-supervised learning, contrastive Heterogeneous Graph Neural Networks (HGNNs) have shown great potential by utilizing data augmentation and contrastive discriminators for downstream tasks.
We propose a novel Generative-Enhanced Heterogeneous Graph Contrastive Learning (GHGCL).
arXiv Detail & Related papers (2024-04-03T15:31:18Z)
- Data-efficient Large Vision Models through Sequential Autoregression [58.26179273091461]
We develop an efficient, autoregression-based vision model on a limited dataset.
We demonstrate how this model achieves proficiency in a spectrum of visual tasks spanning both high-level and low-level semantic understanding.
Our empirical evaluations underscore the model's agility in adapting to various tasks, heralding a significant reduction in the parameter footprint.
arXiv Detail & Related papers (2024-02-07T13:41:53Z)
- Bilevel Fast Scene Adaptation for Low-Light Image Enhancement [50.639332885989255]
Enhancing images in low-light scenes is a challenging but widely studied task in computer vision.
The main obstacle is the difficulty of modeling the distribution discrepancy across different scenes.
We introduce the bilevel paradigm to model the above latent correspondence.
A bilevel learning framework is constructed to endow the encoder with scene-irrelevant generality across diverse scenes.
arXiv Detail & Related papers (2023-06-02T08:16:21Z)