4KDehazeFlow: Ultra-High-Definition Image Dehazing via Flow Matching
- URL: http://arxiv.org/abs/2511.09055v1
- Date: Thu, 13 Nov 2025 01:29:06 GMT
- Title: 4KDehazeFlow: Ultra-High-Definition Image Dehazing via Flow Matching
- Authors: Xingchi Chen, Pu Wang, Xuerui Li, Chaopeng Li, Juxiang Zhou, Jianhou Gan, Dianjie Lu, Guijuan Zhang, Wenqi Ren, Zhuoran Zheng,
- Abstract summary: 4KDehazeFlow is a novel method based on Flow Matching and a Haze-Aware vector field. It provides efficient, data-driven, adaptive nonlinear color transformation for high-quality dehazing, delivering a 2 dB PSNR increase and better performance in dense haze and color fidelity.
- Score: 47.857232695201645
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Ultra-High-Definition (UHD) image dehazing faces challenges such as limited scene adaptability in prior-based methods and high computational complexity with color distortion in deep learning approaches. To address these issues, we propose 4KDehazeFlow, a novel method based on Flow Matching and the Haze-Aware vector field. This method models the dehazing process as a progressive optimization of continuous vector field flow, providing efficient, data-driven, adaptive nonlinear color transformation for high-quality dehazing. Specifically, our method has the following advantages: 1) 4KDehazeFlow is a general method compatible with various deep learning networks, without relying on any specific network architecture. 2) We propose a learnable 3D lookup table (LUT) that encodes haze transformation parameters into a compact 3D mapping matrix, enabling efficient inference through precomputed mappings. 3) We utilize a fourth-order Runge-Kutta (RK4) ordinary differential equation (ODE) solver to stably solve the dehazing flow field through an accurate step-by-step iterative method, effectively suppressing artifacts. Extensive experiments show that 4KDehazeFlow outperforms seven state-of-the-art methods, delivering a 2 dB PSNR increase and better performance in dense haze and color fidelity.
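The abstract does not include code, but the RK4 solver it names is the standard fourth-order Runge-Kutta scheme. The sketch below (NumPy) shows how a state would be pushed along a flow field by step-by-step RK4 updates; the exponential-decay field here is a toy stand-in for the paper's learned haze-aware vector field, and all names are illustrative assumptions.

```python
import numpy as np

def rk4_integrate(v, x0, t0=0.0, t1=1.0, steps=20):
    """Integrate dx/dt = v(x, t) from t0 to t1 with the classic
    fourth-order Runge-Kutta scheme. In the paper's setting, x would
    be the image (or its latent) and v the learned dehazing flow."""
    x = np.asarray(x0, dtype=float)
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        k1 = v(x, t)
        k2 = v(x + 0.5 * h * k1, t + 0.5 * h)
        k3 = v(x + 0.5 * h * k2, t + 0.5 * h)
        k4 = v(x + h * k3, t + h)
        x = x + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        t += h
    return x

# Toy stand-in field: exponential decay, whose exact flow is x0 * exp(-t).
decay = lambda x, t: -x
x1 = rk4_integrate(decay, x0=1.0, steps=20)
```

Because RK4's global error shrinks as the fourth power of the step size, even 20 steps recover the exact flow of this toy field to well under 1e-6, which is the stability property the paper leans on for artifact suppression.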
Related papers
- SF3D-RGB: Scene Flow Estimation from Monocular Camera and Sparse LiDAR [17.224692757126153]
We present a deep learning architecture for sparse scene flow estimation using 2D monocular images and 3D point clouds. Our architecture is an end-to-end model that first encodes information from each modality into features and fuses them together. Experiments show that our proposed method outperforms single-modality methods and achieves better scene flow accuracy on real-world datasets.
arXiv Detail & Related papers (2026-02-25T09:03:42Z) - Fast & Efficient Normalizing Flows and Applications of Image Generative Models [0.0]
This thesis presents novel contributions in two primary areas: advancing the efficiency of generative models, particularly normalizing flows, and applying generative models to solve real-world computer vision challenges. The first part introduces significant improvements to normalizing flow architectures through six key innovations: 1) development of invertible 3x3 convolution layers with mathematically proven necessary and sufficient conditions for invertibility, 2) introduction of a more efficient Quad-coupling layer, 3) design of a fast and efficient parallel inversion algorithm for kxk convolutional layers, 4) a fast and efficient backpropagation algorithm for the inverse of convolution, 5) using the inverse of convolution in Inverse-
arXiv Detail & Related papers (2025-12-03T18:29:03Z) - DriveFlow: Rectified Flow Adaptation for Robust 3D Object Detection in Autonomous Driving [85.14946767994932]
DriveFlow is a Rectified Flow Adaptation method for training data enhancement in autonomous driving. It incorporates a high-frequency alignment loss for the foreground to maintain precise 3D object geometry. It also conducts dual-frequency optimization for the background, balancing editing flexibility and semantic consistency.
arXiv Detail & Related papers (2025-11-24T03:12:43Z) - Diff4Splat: Controllable 4D Scene Generation with Latent Dynamic Reconstruction Models [79.06910348413861]
We introduce Diff4Splat, a feed-forward method that synthesizes controllable and explicit 4D scenes from a single image. Given a single input image, a camera trajectory, and an optional text prompt, Diff4Splat directly predicts a deformable 3D Gaussian field that encodes appearance, geometry, and motion.
arXiv Detail & Related papers (2025-11-01T11:16:25Z) - FlowLUT: Efficient Image Enhancement via Differentiable LUTs and Iterative Flow Matching [10.213645938731338]
FlowLUT is a novel end-to-end model that integrates the efficiency of LUTs, multiple priors, and the parameter-independent characteristic of flow-matched reconstructed images. A lightweight fusion prediction network runs on multiple 3D LUTs, with $\mathcal{O}(1)$ complexity for scene-adaptive color correction. The entire model is jointly optimized under a composite loss function enforcing perceptual and structural fidelity.
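The $\mathcal{O}(1)$-per-pixel color correction that both FlowLUT and 4KDehazeFlow rely on is the standard 3D-LUT lookup with trilinear interpolation. Neither abstract gives LUT sizes or code, so the sketch below is a generic NumPy stand-in: the 17-point grid and the identity LUT are assumptions for illustration, not details from the papers.

```python
import numpy as np

def apply_lut3d(img, lut):
    """Map colors through a 3D LUT with trilinear interpolation.
    img: (H, W, 3) floats in [0, 1]; lut: (N, N, N, 3) RGB mapping."""
    n = lut.shape[0]
    pos = img * (n - 1)                      # continuous grid coordinates
    lo = np.floor(pos).astype(int)
    hi = np.minimum(lo + 1, n - 1)
    f = pos - lo                             # fractional offsets, (H, W, 3)
    out = np.zeros_like(img, dtype=float)
    # Blend the 8 surrounding lattice corners with trilinear weights.
    for dr in (0, 1):
        for dg in (0, 1):
            for db in (0, 1):
                r = hi[..., 0] if dr else lo[..., 0]
                g = hi[..., 1] if dg else lo[..., 1]
                b = hi[..., 2] if db else lo[..., 2]
                w = ((f[..., 0] if dr else 1.0 - f[..., 0])
                     * (f[..., 1] if dg else 1.0 - f[..., 1])
                     * (f[..., 2] if db else 1.0 - f[..., 2]))
                out += w[..., None] * lut[r, g, b]
    return out

# Sanity check with an identity LUT: lut[i, j, k] = (i, j, k) / (n - 1),
# so every color should map to itself.
n = 17
grid = np.linspace(0.0, 1.0, n)
identity_lut = np.stack(np.meshgrid(grid, grid, grid, indexing="ij"), axis=-1)
img = np.random.rand(8, 8, 3)
out = apply_lut3d(img, identity_lut)
```

The cost per pixel is eight table reads and a weighted sum regardless of image resolution, which is what makes LUT-based correction attractive at 4K.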
arXiv Detail & Related papers (2025-09-28T03:22:01Z) - Pseudo Depth Meets Gaussian: A Feed-forward RGB SLAM Baseline [64.42938561167402]
We propose an online 3D reconstruction method using 3D Gaussian-based SLAM, combined with a feed-forward recurrent prediction module. This approach replaces slow test-time optimization with fast network inference, significantly improving tracking speed. Our method achieves performance on par with the state-of-the-art SplaTAM, while reducing tracking time by more than 90%.
arXiv Detail & Related papers (2025-08-06T16:16:58Z) - Gaussian Primitives for Deformable Image Registration [9.184092856125067]
Experimental results on brain MRI, lung CT, and cardiac MRI datasets demonstrate that GaussianDIR outperforms existing DIR methods in both accuracy and efficiency.
As a training-free approach, it challenges the stereotype that iterative methods are inherently slow and transcends the limitations of poor generalization.
arXiv Detail & Related papers (2024-06-05T15:44:54Z) - TriPlaneNet: An Encoder for EG3D Inversion [1.9567015559455132]
NeRF-based GANs have introduced a number of approaches for high-resolution and high-fidelity generative modeling of human heads.
Despite the success of universal optimization-based methods for 2D GAN inversion, those applied to 3D GANs may fail to extrapolate the result onto the novel view.
We introduce a fast technique that bridges the gap between the two approaches by directly utilizing the tri-plane representation presented for the EG3D generative model.
arXiv Detail & Related papers (2023-03-23T17:56:20Z) - 4K-NeRF: High Fidelity Neural Radiance Fields at Ultra High Resolutions [19.380248980850727]
We present a novel and effective framework, named 4K-NeRF, to pursue high fidelity view synthesis on the challenging scenarios of ultra high resolutions.
We address the issue by exploring ray correlation to enhance high-frequency details recovery.
Our method can significantly boost rendering quality on high-frequency details compared with modern NeRF methods, and achieve the state-of-the-art visual quality on 4K ultra-high-resolution scenarios.
arXiv Detail & Related papers (2022-12-09T07:26:49Z) - V4d: voxel for 4d novel view synthesis [21.985228924523543]
We utilize 3D Voxel to model the 4D neural radiance field, short as V4D, where the 3D voxel has two formats.
The proposed LUTs-based refinement module achieves the performance gain with little computational cost.
arXiv Detail & Related papers (2022-05-28T04:45:07Z) - Displacement-Invariant Cost Computation for Efficient Stereo Matching [122.94051630000934]
Deep learning methods have dominated stereo matching leaderboards by yielding unprecedented disparity accuracy.
But their inference time is typically slow, on the order of seconds for a pair of 540p images.
We propose a displacement-invariant cost module to compute the matching costs without needing a 4D feature volume.
arXiv Detail & Related papers (2020-12-01T23:58:16Z) - Learning Deformable Tetrahedral Meshes for 3D Reconstruction [78.0514377738632]
3D shape representations that accommodate learning-based 3D reconstruction are an open problem in machine learning and computer graphics.
Previous work on neural 3D reconstruction demonstrated benefits, but also limitations, of point cloud, voxel, surface mesh, and implicit function representations.
We introduce Deformable Tetrahedral Meshes (DefTet) as a particular parameterization that utilizes volumetric tetrahedral meshes for the reconstruction problem.
arXiv Detail & Related papers (2020-11-03T02:57:01Z) - Towards Reading Beyond Faces for Sparsity-Aware 4D Affect Recognition [55.15661254072032]
We present a sparsity-aware deep network for automatic 4D facial expression recognition (FER).
We first propose a novel augmentation method to combat the data limitation problem for deep learning.
We then present a sparsity-aware deep network to compute the sparse representations of convolutional features over multi-views.
arXiv Detail & Related papers (2020-02-08T13:09:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.