CFDNet: A Generalizable Foggy Stereo Matching Network with Contrastive
Feature Distillation
- URL: http://arxiv.org/abs/2402.18181v2
- Date: Thu, 29 Feb 2024 07:42:53 GMT
- Title: CFDNet: A Generalizable Foggy Stereo Matching Network with Contrastive
Feature Distillation
- Authors: Zihua Liu, Yizhou Li and Masatoshi Okutomi
- Abstract summary: We introduce a framework based on contrastive feature distillation (CFD).
This strategy combines feature distillation from merged clean-fog features with contrastive learning, ensuring balanced dependence on fog depth hints and clean matching features.
- Score: 11.655465312241699
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Stereo matching under foggy scenes remains a challenging task since the
scattering effect degrades the visibility and results in less distinctive
features for dense correspondence matching. While some previous learning-based
methods integrated a physical scattering function for simultaneous
stereo-matching and dehazing, simply removing fog might not aid depth
estimation because the fog itself can provide crucial depth cues. In this work,
we introduce a framework based on contrastive feature distillation (CFD). This
strategy combines feature distillation from merged clean-fog features with
contrastive learning, ensuring balanced dependence on fog depth hints and clean
matching features. This framework helps to enhance model generalization across
both clean and foggy environments. Comprehensive experiments on synthetic and
real-world datasets affirm the superior strength and adaptability of our
method.
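The two ingredients of the abstract — the physical scattering model that makes fog a depth cue, and a distillation loss that balances fog hints against clean matching features — can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the function names, the convex-combination teacher, and the InfoNCE-style contrastive term are assumptions made for illustration; only the atmospheric scattering formula I = J·t + A·(1 − t) with t = exp(−β·d) is the standard model the abstract alludes to.

```python
import numpy as np

def scattering_model(J, depth, A=0.9, beta=1.2):
    """Standard atmospheric scattering model: I = J*t + A*(1 - t),
    with transmission t = exp(-beta * depth). Fog density thus encodes
    depth, which is why naively removing it can discard depth cues."""
    t = np.exp(-beta * depth)
    return J * t + A * (1.0 - t)

def l2_normalize(x, axis=-1, eps=1e-8):
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def contrastive_distillation_loss(f_student, f_clean, f_fog, alpha=0.5, tau=0.1):
    """Hypothetical sketch of contrastive feature distillation: the student
    feature is pulled toward a merged clean/fog teacher (distillation term),
    while an InfoNCE-style term keeps each student feature close to its own
    teacher and away from other samples in the batch."""
    # Merged teacher: convex combination of clean and fog features.
    f_teacher = alpha * f_clean + (1.0 - alpha) * f_fog
    # Distillation term: squared L2 distance to the merged teacher.
    distill = np.mean(np.sum((f_student - f_teacher) ** 2, axis=1))
    # Contrastive term: diagonal entries are the positive pairs.
    s = l2_normalize(f_student)
    t = l2_normalize(f_teacher)
    logits = s @ t.T / tau                       # (N, N) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    contrast = -np.mean(np.diag(log_prob))       # InfoNCE with diagonal positives
    return distill + contrast
```

With `alpha` near 1 the student depends mostly on clean matching features; near 0 it leans on fog-contaminated features and their depth hints, which is the balance the abstract describes.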
Related papers
- Self-Supervised Enhancement of Forward-Looking Sonar Images: Bridging Cross-Modal Degradation Gaps through Feature Space Transformation and Multi-Frame Fusion [17.384482405769567]
Enhancing forward-looking sonar images is critical for accurate underwater target detection.
We propose a feature-space transformation that maps sonar images from the pixel domain to a robust feature domain.
Our method significantly outperforms existing approaches, effectively suppressing noise, preserving detailed edges, and substantially improving brightness.
arXiv Detail & Related papers (2025-04-15T08:34:56Z) - Pseudo-Label Guided Real-World Image De-weathering: A Learning Framework with Imperfect Supervision [57.5699142476311]
We propose a unified solution for real-world image de-weathering with non-ideal supervision.
Our method exhibits significant advantages when trained on imperfectly aligned de-weathering datasets.
arXiv Detail & Related papers (2025-04-14T07:24:03Z) - Every SAM Drop Counts: Embracing Semantic Priors for Multi-Modality Image Fusion and Beyond [52.486290612938895]
We propose a novel method that leverages the semantic knowledge from the Segment Anything Model (SAM) to improve the quality of fusion results and enable downstream task adaptability.
Specifically, we design a Semantic Persistent Attention (SPA) Module that efficiently maintains source information via the persistent repository while extracting high-level semantic priors from SAM.
Our method achieves a balance between high-quality visual results and downstream task adaptability while maintaining practical deployment efficiency.
arXiv Detail & Related papers (2025-03-03T06:16:31Z) - FaithDiff: Unleashing Diffusion Priors for Faithful Image Super-resolution [48.88184541515326]
We propose a simple and effective method, named FaithDiff, to fully harness the power of latent diffusion models (LDMs) for faithful image SR.
In contrast to existing diffusion-based SR methods that freeze the diffusion model pre-trained on high-quality images, we propose to unleash the diffusion prior to identify useful information and recover faithful structures.
arXiv Detail & Related papers (2024-11-27T23:58:03Z) - Improving Apple Object Detection with Occlusion-Enhanced Distillation [1.0049237739132246]
Apples growing in natural environments often face severe visual obstructions from leaves and branches.
We introduce a technique called "Occlusion-Enhanced Distillation" (OED) to regularize the learning of semantically aligned features on occluded datasets.
Our method significantly outperforms current state-of-the-art techniques through extensive comparative experiments.
arXiv Detail & Related papers (2024-09-03T03:11:48Z) - ODCR: Orthogonal Decoupling Contrastive Regularization for Unpaired Image Dehazing [2.5944091779488123]
Unpaired image dehazing (UID) holds significant research importance due to the challenges in acquiring haze/clear image pairs with identical backgrounds.
This paper proposes a novel method for UID named Orthogonal Decoupling Contrastive Regularization (ODCR).
arXiv Detail & Related papers (2024-04-27T08:13:13Z) - MaeFuse: Transferring Omni Features with Pretrained Masked Autoencoders for Infrared and Visible Image Fusion via Guided Training [57.18758272617101]
MaeFuse is a novel autoencoder model designed for infrared and visible image fusion (IVIF)
Our model utilizes a pretrained encoder from Masked Autoencoders (MAE), which facilitates omni-feature extraction for low-level reconstruction and high-level vision tasks.
MaeFuse not only introduces a novel perspective in the realm of fusion techniques but also stands out with impressive performance across various public datasets.
arXiv Detail & Related papers (2024-04-17T02:47:39Z) - Digging into contrastive learning for robust depth estimation with diffusion models [55.62276027922499]
We propose a novel robust depth estimation method called D4RD.
It features a custom contrastive learning mode tailored for diffusion models to mitigate performance degradation in complex environments.
In experiments, D4RD surpasses existing state-of-the-art solutions on synthetic corruption datasets and real-world weather conditions.
arXiv Detail & Related papers (2024-04-15T14:29:47Z) - Unveiling the Depths: A Multi-Modal Fusion Framework for Challenging
Scenarios [103.72094710263656]
This paper presents a novel approach that identifies and integrates dominant cross-modality depth features with a learning-based framework.
We propose a novel confidence loss that steers a confidence predictor network to yield a confidence map specifying latent potential depth areas.
With the resulting confidence map, we propose a multi-modal fusion network that fuses the final depth in an end-to-end manner.
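The confidence-guided fusion described above can be sketched as a per-pixel blend. This is a hypothetical simplification — the function name and the linear blending rule are assumptions for illustration; the actual paper fuses features with a learned end-to-end network rather than blending final depths linearly.

```python
import numpy as np

def fuse_depth(d_primary, d_secondary, confidence):
    """Hypothetical sketch of confidence-guided depth fusion: a predicted
    confidence map (values in [0, 1]) decides, per pixel, how much to trust
    the primary modality's depth versus the secondary one."""
    c = np.clip(confidence, 0.0, 1.0)
    return c * d_primary + (1.0 - c) * d_secondary
```

Where the confidence map is 1 the primary modality's depth is kept unchanged; where it is 0 the fusion falls back entirely to the secondary modality.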
arXiv Detail & Related papers (2024-02-19T04:39:16Z) - Object-centric Cross-modal Feature Distillation for Event-based Object
Detection [87.50272918262361]
RGB detectors still outperform event-based detectors due to the sparsity of event data and missing visual details.
We develop a novel knowledge distillation approach to shrink the performance gap between these two modalities.
We show that object-centric distillation significantly improves the performance of the event-based student object detector.
arXiv Detail & Related papers (2023-11-09T16:33:08Z) - PAIF: Perception-Aware Infrared-Visible Image Fusion for Attack-Tolerant
Semantic Segmentation [50.556961575275345]
We propose a perception-aware fusion framework to promote segmentation robustness in adversarial scenes.
We show that our scheme substantially enhances the robustness, with gains of 15.3% mIoU, compared with advanced competitors.
arXiv Detail & Related papers (2023-08-08T01:55:44Z) - Structure Representation Network and Uncertainty Feedback Learning for
Dense Non-Uniform Fog Removal [64.77435210892041]
We introduce a structure-representation network with uncertainty feedback learning.
Specifically, we extract the feature representations from a pre-trained Vision Transformer (DINO-ViT) module to recover the background information.
To handle the intractability of estimating the atmospheric light colors, we exploit the grayscale version of our input image.
arXiv Detail & Related papers (2022-10-06T17:10:57Z) - Nonlinear Intensity Underwater Sonar Image Matching Method Based on
Phase Information and Deep Convolution Features [6.759506053568929]
This paper proposes a combined matching method based on phase information and deep convolution features.
It has two outstanding advantages: one is that the deep convolution features could be used to measure the similarity of the local and global positions of the sonar image.
arXiv Detail & Related papers (2021-11-29T02:36:49Z) - Blur-Attention: A boosting mechanism for non-uniform blurred image
restoration [27.075713246257596]
We propose a blur-attention module to dynamically capture the spatially varying features of non-uniform blurred images.
By introducing the blur-attention network into a conditional generative adversarial framework, we propose an end-to-end blind motion deblurring method.
Experimental results show that our method achieves outstanding deblurring performance in terms of PSNR, SSIM, and subjective visual quality.
arXiv Detail & Related papers (2020-08-19T16:07:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.