ELFNet: Evidential Local-global Fusion for Stereo Matching
- URL: http://arxiv.org/abs/2308.00728v1
- Date: Tue, 1 Aug 2023 15:51:04 GMT
- Title: ELFNet: Evidential Local-global Fusion for Stereo Matching
- Authors: Jieming Lou, Weide Liu, Zhuo Chen, Fayao Liu, and Jun Cheng
- Abstract summary: We introduce the Evidential Local-global Fusion (ELF) framework for stereo matching.
It endows both uncertainty estimation and confidence-aware fusion with trustworthy heads.
- Score: 17.675146012208124
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although existing stereo matching models have achieved continuous
improvement, they often face issues related to trustworthiness due to the
absence of uncertainty estimation. Additionally, effectively leveraging
multi-scale and multi-view knowledge of stereo pairs remains unexplored. In
this paper, we introduce the Evidential Local-global Fusion (ELF) framework for stereo matching, which endows both
uncertainty estimation and confidence-aware fusion with trustworthy heads.
Instead of predicting the disparity map alone, our model estimates an
evidential-based disparity considering both aleatoric and epistemic
uncertainties. With the normal inverse-Gamma distribution as a bridge, the
proposed framework realizes intra evidential fusion of multi-level predictions
and inter evidential fusion between cost-volume-based and transformer-based
stereo matching. Extensive experimental results show that the proposed
framework exploits multi-view information effectively and achieves
state-of-the-art overall performance in both accuracy and cross-domain
generalization.
The codes are available at https://github.com/jimmy19991222/ELFNet.
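The abstract's key mechanism is replacing a point disparity estimate with the parameters of a normal inverse-Gamma (NIG) distribution, from which aleatoric and epistemic uncertainty can be read off. Below is a minimal, hypothetical PyTorch sketch of such an evidential head; the module name, channel layout, and activation choices are illustrative assumptions rather than ELFNet's actual implementation (the linked repository is authoritative).

```python
# Minimal sketch (not the authors' code) of an evidential disparity head that
# predicts Normal-Inverse-Gamma (NIG) parameters instead of a bare disparity map.
import torch
import torch.nn as nn
import torch.nn.functional as F

class EvidentialDisparityHead(nn.Module):
    """Hypothetical head: maps a feature map to per-pixel NIG parameters."""
    def __init__(self, in_channels: int):
        super().__init__()
        # One 3x3 conv produces four channels: gamma (disparity), nu, alpha, beta.
        self.proj = nn.Conv2d(in_channels, 4, kernel_size=3, padding=1)

    def forward(self, feats: torch.Tensor):
        gamma, raw_nu, raw_alpha, raw_beta = self.proj(feats).chunk(4, dim=1)
        nu = F.softplus(raw_nu)              # nu > 0
        alpha = F.softplus(raw_alpha) + 1.0  # alpha > 1 keeps the moments finite
        beta = F.softplus(raw_beta)          # beta > 0
        # Standard NIG uncertainty decomposition:
        aleatoric = beta / (alpha - 1)           # E[sigma^2], data noise
        epistemic = beta / (nu * (alpha - 1))    # Var[mu], model ignorance
        return gamma, nu, alpha, beta, aleatoric, epistemic

# Usage on a dummy stereo feature map (batch 1, 32 channels, 64x128 pixels):
head = EvidentialDisparityHead(in_channels=32)
gamma, nu, alpha, beta, aleatoric, epistemic = head(torch.randn(1, 32, 64, 128))
```

Confidence-aware fusion then combines such NIG predictions across levels and branches; one common combination rule is sketched after the related-papers list below.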
Related papers
- Confidence-aware multi-modality learning for eye disease screening [58.861421804458395]
We propose a novel multi-modality evidential fusion pipeline for eye disease screening.
It provides a measure of confidence for each modality and elegantly integrates the multi-modality information.
Experimental results on both public and internal datasets demonstrate that our model excels in robustness.
arXiv Detail & Related papers (2024-05-28T13:27:30Z)
- MERIT: Multi-view Evidential learning for Reliable and Interpretable liver fibrosis sTaging [29.542924813666698]
We propose a new multi-view method based on evidential learning, referred to as MERIT.
MERIT enables uncertainty of the predictions to enhance reliability, and employs a logic-based combination rule to improve interpretability.
Results have showcased the effectiveness of the proposed MERIT, highlighting the reliability and offering both ad-hoc and post-hoc interpretability.
arXiv Detail & Related papers (2024-05-05T12:52:28Z)
- Prototype-based Aleatoric Uncertainty Quantification for Cross-modal Retrieval [139.21955930418815]
Cross-modal Retrieval methods build similarity relations between vision and language modalities by jointly learning a common representation space.
However, the predictions are often unreliable due to aleatoric uncertainty, which is induced by low-quality data, e.g., corrupt images, fast-paced videos, and non-detailed texts.
We propose a novel Prototype-based Aleatoric Uncertainty Quantification (PAU) framework to provide trustworthy predictions by quantifying the uncertainty arising from the inherent data ambiguity.
arXiv Detail & Related papers (2023-09-29T09:41:19Z)
- Integrating Large Pre-trained Models into Multimodal Named Entity Recognition with Evidential Fusion [31.234455370113075]
We propose incorporating uncertainty estimation into the MNER task, producing trustworthy predictions.
Our proposed algorithm models the distribution of each modality as a Normal-inverse Gamma distribution, and fuses them into a unified distribution.
Experiments on two datasets demonstrate that our proposed method outperforms the baselines and achieves new state-of-the-art performance.
arXiv Detail & Related papers (2023-06-29T14:50:23Z)
- Cross-Attention is Not Enough: Incongruity-Aware Dynamic Hierarchical Fusion for Multimodal Affect Recognition [69.32305810128994]
Incongruity between modalities poses a challenge for multimodal fusion, especially in affect recognition.
We propose the Hierarchical Crossmodal Transformer with Dynamic Modality Gating (HCT-DMG), a lightweight incongruity-aware model.
HCT-DMG: 1) outperforms previous multimodal models with a reduced size of approximately 0.8M parameters; 2) recognizes hard samples where incongruity makes affect recognition difficult; 3) mitigates the incongruity at the latent level in crossmodal attention.
arXiv Detail & Related papers (2023-05-23T01:24:15Z)
- Trusted Multi-View Classification with Dynamic Evidential Fusion [73.35990456162745]
We propose a novel multi-view classification algorithm, termed trusted multi-view classification (TMC).
TMC provides a new paradigm for multi-view learning by dynamically integrating different views at an evidence level.
Both theoretical and experimental results validate the effectiveness of the proposed model in accuracy, robustness and trustworthiness.
arXiv Detail & Related papers (2022-04-25T03:48:49Z)
- Trustworthy Multimodal Regression with Mixture of Normal-inverse Gamma Distributions [91.63716984911278]
We introduce a novel Mixture of Normal-Inverse Gamma distributions (MoNIG) algorithm, which efficiently estimates uncertainty in a principled way for adaptive integration of different modalities and produces trustworthy regression results.
Experimental results on both synthetic and real-world data demonstrate the effectiveness and trustworthiness of our method on various multimodal regression tasks (a sketch of this style of NIG fusion appears after this list).
arXiv Detail & Related papers (2021-11-11T14:28:12Z)
- Trusted Multi-View Classification [76.73585034192894]
We propose a novel multi-view classification method, termed trusted multi-view classification.
It provides a new paradigm for multi-view learning by dynamically integrating different views at an evidence level.
The proposed algorithm jointly utilizes multiple views to promote both classification reliability and robustness.
arXiv Detail & Related papers (2021-02-03T13:30:26Z)
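Several entries above (the ELF abstract's intra and inter evidential fusion, the evidential-fusion MNER paper, and MoNIG) combine normal inverse-Gamma predictions into a single distribution. The sketch below shows the NIG summation rule described in the MoNIG paper; it is offered as an illustrative assumption about how such fusion operates, not as a verified transcription of any of these papers' code.

```python
# Minimal sketch of an NIG "summation" that merges two evidential predictions
# into one, following the combination rule described in the MoNIG paper above.
# Whether ELFNet uses exactly this rule should be checked against its released code.
from dataclasses import dataclass

@dataclass
class NIG:
    mu: float     # predicted value (e.g., disparity)
    nu: float     # virtual observations supporting mu
    alpha: float  # shape of the inverse-Gamma component (alpha > 1)
    beta: float   # scale of the inverse-Gamma component

def fuse(a: NIG, b: NIG) -> NIG:
    """Combine two NIG distributions into a single NIG."""
    mu = (a.nu * a.mu + b.nu * b.mu) / (a.nu + b.nu)   # evidence-weighted mean
    nu = a.nu + b.nu
    alpha = a.alpha + b.alpha + 0.5
    beta = (a.beta + b.beta
            + 0.5 * a.nu * (a.mu - mu) ** 2
            + 0.5 * b.nu * (b.mu - mu) ** 2)           # disagreement inflates beta
    return NIG(mu, nu, alpha, beta)

# Example: two branches disagree slightly; the fused mean leans toward the
# branch with more evidence (larger nu).
print(fuse(NIG(mu=10.0, nu=4.0, alpha=3.0, beta=1.0),
           NIG(mu=12.0, nu=1.0, alpha=2.0, beta=1.5)))
```

Note how disagreement between the two means enlarges the fused beta, so the combined prediction becomes less confident whenever the branches conflict.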
This list is automatically generated from the titles and abstracts of the papers in this site.