Cocoon: Robust Multi-Modal Perception with Uncertainty-Aware Sensor Fusion
- URL: http://arxiv.org/abs/2410.12592v1
- Date: Wed, 16 Oct 2024 14:10:53 GMT
- Title: Cocoon: Robust Multi-Modal Perception with Uncertainty-Aware Sensor Fusion
- Authors: Minkyoung Cho, Yulong Cao, Jiachen Sun, Qingzhao Zhang, Marco Pavone, Jeong Joon Park, Heng Yang, Z. Morley Mao
- Abstract summary: We introduce Cocoon, an object- and feature-level uncertainty-aware fusion framework.
Key innovation lies in uncertainty quantification for heterogeneous representations.
Cocoon consistently outperforms existing static and adaptive methods in both normal and challenging conditions.
- Score: 26.979291099052194
- Abstract: An important paradigm in 3D object detection is the use of multiple modalities to enhance accuracy in both normal and challenging conditions, particularly for long-tail scenarios. To address this, recent studies have explored two directions of adaptive approaches: MoE-based adaptive fusion, which struggles with uncertainties arising from distinct object configurations, and late fusion for output-level adaptive fusion, which relies on separate detection pipelines and limits comprehensive understanding. In this work, we introduce Cocoon, an object- and feature-level uncertainty-aware fusion framework. The key innovation lies in uncertainty quantification for heterogeneous representations, enabling fair comparison across modalities through the introduction of a feature aligner and a learnable surrogate ground truth, termed feature impression. We also define a training objective to ensure that their relationship provides a valid metric for uncertainty quantification. Cocoon consistently outperforms existing static and adaptive methods in both normal and challenging conditions, including those with natural and artificial corruptions. Furthermore, we show the validity and efficacy of our uncertainty metric across diverse datasets.
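The abstract's central mechanism, comparing aligned per-modality features against a learnable surrogate ground truth ("feature impression") and using the resulting distance as an uncertainty signal for fusion weights, could be sketched roughly as follows. This is a hypothetical illustration, not the authors' implementation; the names `cocoon_style_fusion`, `impressions`, and `temperature` are assumptions.

```python
import numpy as np

def cocoon_style_fusion(feats, impressions, temperature=1.0):
    """Hypothetical sketch of object-/feature-level uncertainty-aware fusion.

    feats: dict modality -> (D,) aligned feature vector for one object proposal
    impressions: dict modality -> (D,) learned surrogate ground truth
    A modality whose feature lies far from its feature impression is treated
    as more uncertain and is down-weighted in the fused representation.
    """
    mods = sorted(feats)
    # Uncertainty proxy: distance between each feature and its feature impression.
    dists = np.array([np.linalg.norm(feats[m] - impressions[m]) for m in mods])
    # Lower distance -> higher weight (softmax over negative distances).
    w = np.exp(-dists / temperature)
    w /= w.sum()
    fused = sum(wi * feats[m] for wi, m in zip(w, mods))
    return fused, dict(zip(mods, w))
```

A modality corrupted at test time (e.g. a blinded camera) would drift away from its impression and lose influence, which matches the adaptive behavior the abstract describes.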
Related papers
- Know Where You're Uncertain When Planning with Multimodal Foundation Models: A Formal Framework [54.40508478482667]
We present a comprehensive framework to disentangle, quantify, and mitigate uncertainty in perception and plan generation.
We propose methods tailored to the unique properties of perception and decision-making.
We show that our uncertainty disentanglement framework reduces variability by up to 40% and enhances task success rates by 5% compared to baselines.
arXiv Detail & Related papers (2024-11-03T17:32:00Z)
- UAHOI: Uncertainty-aware Robust Interaction Learning for HOI Detection [18.25576487115016]
This paper focuses on Human-Object Interaction (HOI) detection.
It addresses the challenge of identifying and understanding the interactions between humans and objects within a given image or video frame.
We propose a novel approach, UAHOI: Uncertainty-aware Robust Human-Object Interaction Learning.
arXiv Detail & Related papers (2024-08-14T10:06:39Z)
- Regulating Model Reliance on Non-Robust Features by Smoothing Input Marginal Density [93.32594873253534]
Trustworthy machine learning requires meticulous regulation of model reliance on non-robust features.
We propose a framework to delineate and regulate such features by attributing model predictions to the input.
arXiv Detail & Related papers (2024-07-05T09:16:56Z)
- Mutual Information-calibrated Conformal Feature Fusion for Uncertainty-Aware Multimodal 3D Object Detection at the Edge [1.7898305876314982]
Three-dimensional (3D) object detection, a critical robotics operation, has seen significant advancements.
Our study integrates the principles of conformal inference with information theoretic measures to perform lightweight, Monte Carlo-free uncertainty estimation.
The framework demonstrates comparable or better performance in KITTI 3D object detection benchmarks to similar methods that are not uncertainty-aware.
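As a rough, hedged sketch of the split-conformal idea behind such Monte Carlo-free uncertainty estimation (the function name and the choice of nonconformity score are assumptions, not the paper's method):

```python
import numpy as np

def conformal_threshold(cal_scores, alpha=0.1):
    """Split-conformal quantile over nonconformity scores from a held-out
    calibration set. New scores at or below the returned threshold fall in
    a set with roughly (1 - alpha) coverage, with no Monte Carlo sampling.
    """
    n = len(cal_scores)
    # Finite-sample corrected quantile level, capped at 1.0 for small n.
    q = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    return np.quantile(cal_scores, q, method="higher")
```

In a detection setting, the nonconformity score might be, for example, the regression residual of a predicted box on calibration data; detections whose score exceeds the calibrated threshold would be flagged as uncertain.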
arXiv Detail & Related papers (2023-09-18T09:02:44Z)
- A Sequentially Fair Mechanism for Multiple Sensitive Attributes [0.46040036610482665]
In the standard use case of Algorithmic Fairness, the goal is to eliminate the relationship between a sensitive variable and a corresponding score.
We propose a sequential framework that progressively achieves fairness across a set of sensitive features.
Our approach seamlessly extends to approximate fairness, enveloping a framework accommodating the trade-off between risk and unfairness.
arXiv Detail & Related papers (2023-09-12T22:31:57Z)
- Cross-Attention is Not Enough: Incongruity-Aware Dynamic Hierarchical Fusion for Multimodal Affect Recognition [69.32305810128994]
Incongruity between modalities poses a challenge for multimodal fusion, especially in affect recognition.
We propose the Hierarchical Crossmodal Transformer with Dynamic Modality Gating (HCT-DMG), a lightweight incongruity-aware model.
HCT-DMG: 1) outperforms previous multimodal models with a reduced size of approximately 0.8M parameters; 2) recognizes hard samples where incongruity makes affect recognition difficult; 3) mitigates the incongruity at the latent level in crossmodal attention.
arXiv Detail & Related papers (2023-05-23T01:24:15Z)
- Composed Image Retrieval with Text Feedback via Multi-grained Uncertainty Regularization [73.04187954213471]
We introduce a unified learning approach to simultaneously model coarse- and fine-grained retrieval.
The proposed method has achieved +4.03%, +3.38%, and +2.40% Recall@50 accuracy over a strong baseline.
arXiv Detail & Related papers (2022-11-14T14:25:40Z)
- Uncertainty Quantification of Collaborative Detection for Self-Driving [12.590332512097698]
Sharing information between connected and autonomous vehicles (CAVs) improves the performance of collaborative object detection for self-driving.
However, CAVs still have uncertainties on object detection due to practical challenges.
Our work is the first to estimate the uncertainty of collaborative object detection.
arXiv Detail & Related papers (2022-09-16T20:30:45Z)
- Exploring the Trade-off between Plausibility, Change Intensity and Adversarial Power in Counterfactual Explanations using Multi-objective Optimization [73.89239820192894]
We argue that automated counterfactual generation should regard several aspects of the produced adversarial instances.
We present a novel framework for the generation of counterfactual examples.
arXiv Detail & Related papers (2022-05-20T15:02:53Z)
- Robustness and Accuracy Could Be Reconcilable by (Proper) Definition [109.62614226793833]
The trade-off between robustness and accuracy has been widely studied in the adversarial literature.
We find that it may stem from the improperly defined robust error, which imposes an inductive bias of local invariance.
By definition, the proposed SCORE (self-consistent robust error) facilitates the reconciliation between robustness and accuracy, while still handling the worst-case uncertainty.
arXiv Detail & Related papers (2022-02-21T10:36:09Z)
- Modal Uncertainty Estimation via Discrete Latent Representation [4.246061945756033]
We introduce a deep learning framework that learns the one-to-many mappings between the inputs and outputs, together with faithful uncertainty measures.
Our framework demonstrates significantly more accurate uncertainty estimation than the current state-of-the-art methods.
arXiv Detail & Related papers (2020-07-25T05:29:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.