A Comparative Analysis of Decision-Level Fusion for Multimodal Driver
Behaviour Understanding
- URL: http://arxiv.org/abs/2204.04734v1
- Date: Sun, 10 Apr 2022 17:49:22 GMT
- Title: A Comparative Analysis of Decision-Level Fusion for Multimodal Driver
Behaviour Understanding
- Authors: Alina Roitberg, Kunyu Peng, Zdravko Marinov, Constantin Seibold, David
Schneider, Rainer Stiefelhagen
- Abstract summary: This paper presents an empirical evaluation of different paradigms for decision-level late fusion in video-based driver observation.
We compare seven different mechanisms for joining the results of single-modal classifiers.
This is the first systematic study of strategies for fusing the outcomes of multimodal predictors inside the vehicle.
- Score: 22.405229530620414
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual recognition inside the vehicle cabin leads to safer driving and more
intuitive human-vehicle interaction but such systems face substantial obstacles
as they need to capture different granularities of driver behaviour while
dealing with highly limited body visibility and changing illumination.
Multimodal recognition mitigates a number of such issues: prediction outcomes
of different sensors complement each other due to different modality-specific
strengths and weaknesses. While several late fusion methods have been
considered in previously published frameworks, they consistently feature
different architecture backbones and building blocks, making it very hard to
isolate the role of the chosen late fusion strategy itself. This paper presents
an empirical evaluation of different paradigms for decision-level late fusion
in video-based driver observation. We compare seven different mechanisms for
joining the results of single-modal classifiers, covering both popular
mechanisms (e.g., score averaging) and mechanisms not yet considered in the
context of driver observation (e.g., rank-level fusion), and evaluating them
based on different criteria and benchmark settings. This is the first
systematic study of strategies for fusing the outcomes of multimodal
predictors inside the vehicle, conducted with the goal of providing guidance
for fusion scheme selection.
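To make the two named paradigms concrete, below is a minimal sketch of score averaging and rank-level (Borda-count-style) fusion of per-modality class probabilities; the function names, shapes, and the specific Borda variant are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def score_average_fusion(probs):
    """Average per-modality class probabilities (probs: [modalities, classes])."""
    return probs.mean(axis=0)

def rank_level_fusion(probs):
    """Borda-count-style rank fusion: each modality ranks the classes,
    and the per-class ranks are summed (higher total = better)."""
    ranks = np.argsort(np.argsort(probs, axis=1), axis=1)  # 0 = worst class
    return ranks.sum(axis=0)

# Two modalities (e.g., video and infrared), three behaviour classes.
probs = np.array([[0.6, 0.3, 0.1],
                  [0.2, 0.5, 0.3]])
print(np.argmax(score_average_fusion(probs)))  # winner under score averaging
print(np.argmax(rank_level_fusion(probs)))     # winner under rank-level fusion
```

On this input the two schemes pick different winners, which is exactly why isolating the fusion strategy from the rest of the pipeline matters.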
Related papers
- MMLF: Multi-modal Multi-class Late Fusion for Object Detection with Uncertainty Estimation [13.624431305114564]
This paper introduces a pioneering Multi-modal Multi-class Late Fusion method, designed to enable multi-class detection via late fusion.
Experiments conducted on the KITTI validation and official test datasets illustrate substantial performance improvements.
Our approach incorporates uncertainty analysis into the classification fusion process, rendering our model more transparent and trustworthy.
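The summary above names uncertainty-aware classification fusion without detailing it; as one generic, hypothetical way to realise the idea (not the MMLF formulation), the sketch below weights each modality's class scores by the inverse of its predictive entropy, so less uncertain modalities contribute more.

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Predictive entropy of a class distribution (higher = more uncertain)."""
    return -np.sum(p * np.log(p + eps), axis=-1)

def uncertainty_weighted_fusion(probs):
    """Weight each modality by inverse entropy, then average (probs: [M, C])."""
    w = 1.0 / (entropy(probs) + 1e-6)
    w = w / w.sum()
    fused = (w[:, None] * probs).sum(axis=0)
    return fused / fused.sum()

print(uncertainty_weighted_fusion(np.array([[0.9, 0.05, 0.05],    # confident
                                            [0.4, 0.35, 0.25]])))  # uncertain
```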
arXiv Detail & Related papers (2024-10-11T11:58:35Z)
- POGEMA: A Benchmark Platform for Cooperative Multi-Agent Navigation [76.67608003501479]
We introduce and specify an evaluation protocol defining a range of domain-related metrics computed on the basis of the primary evaluation indicators.
The results of such a comparison, which involves a variety of state-of-the-art MARL, search-based, and hybrid methods, are presented.
arXiv Detail & Related papers (2024-07-20T16:37:21Z)
- Beyond One Model Fits All: Ensemble Deep Learning for Autonomous Vehicles [16.398646583844286]
This study introduces three distinct neural network models corresponding to Mediated Perception, Behavior Reflex, and Direct Perception approaches.
Our architecture fuses information from the base, future latent vector prediction, and auxiliary task networks, using global routing commands to select appropriate action sub-networks.
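For the routing idea mentioned above, here is a minimal PyTorch sketch of selecting an action sub-network by a discrete global routing command; the dimensions, number of commands, and head design are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class CommandRoutedPolicy(nn.Module):
    """Pick one action head per discrete routing command (e.g., 0=left,
    1=straight, 2=right) on top of shared fused features."""
    def __init__(self, feat_dim=64, n_commands=3, n_actions=2):
        super().__init__()
        self.heads = nn.ModuleList(
            nn.Linear(feat_dim, n_actions) for _ in range(n_commands))

    def forward(self, feats, command):
        # feats: [batch, feat_dim]; command selects the sub-network
        return self.heads[command](feats)

policy = CommandRoutedPolicy()
actions = policy(torch.randn(4, 64), command=1)  # "straight" sub-network
print(actions.shape)  # torch.Size([4, 2])
```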
arXiv Detail & Related papers (2023-12-10T04:40:02Z)
- Interactive Autonomous Navigation with Internal State Inference and Interactivity Estimation [58.21683603243387]
We propose three auxiliary tasks with relational-temporal reasoning and integrate them into the standard Deep Learning framework.
These auxiliary tasks provide additional supervision signals to infer the behavior patterns of other interactive agents.
Our approach achieves robust and state-of-the-art performance in terms of standard evaluation metrics.
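A common, generic way to integrate such auxiliary supervision (assumed here for illustration, not taken from the paper) is a weighted multi-task loss, sketched below.

```python
import torch

def multitask_loss(main_loss, aux_losses, weights):
    """Combine the main driving objective with auxiliary supervision:
    L = L_main + sum_i w_i * L_aux_i (weights are tunable hyperparameters)."""
    return main_loss + sum(w * l for w, l in zip(weights, aux_losses))

total = multitask_loss(torch.tensor(1.2),
                       [torch.tensor(0.5), torch.tensor(0.8), torch.tensor(0.3)],
                       weights=[0.1, 0.1, 0.05])
print(total)  # scalar training objective
```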
arXiv Detail & Related papers (2023-11-27T18:57:42Z)
- Robust Multiview Multimodal Driver Monitoring System Using Masked Multi-Head Self-Attention [28.18784311981388]
We propose a novel multiview multimodal driver monitoring system based on feature-level fusion through multi-head self-attention (MHSA).
We demonstrate its effectiveness by comparing it against four alternative fusion strategies (Sum, Conv, SE, and AFF).
Experiments on this enhanced database demonstrate that 1) the proposed MHSA-based fusion method (AUC-ROC: 97.0%) outperforms all baselines and previous approaches, and 2) training MHSA with patch masking can improve its robustness against modality/view collapses.
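As a rough illustration of feature-level fusion via masked multi-head self-attention, the PyTorch sketch below attends over one feature token per camera view/modality and masks one stream to mimic a dropped modality; the dimensions, pooling, and layer choices are assumptions, not the paper's model.

```python
import torch
import torch.nn as nn

B, S, D = 2, 4, 128            # batch, streams (views/modalities), feature dim
tokens = torch.randn(B, S, D)  # one feature token per stream

mhsa = nn.MultiheadAttention(embed_dim=D, num_heads=4, batch_first=True)

# Mask stream 3 of the first sample to mimic a missing modality/view.
pad_mask = torch.zeros(B, S, dtype=torch.bool)
pad_mask[0, 3] = True

attended, _ = mhsa(tokens, tokens, tokens, key_padding_mask=pad_mask)
fused = attended.mean(dim=1)   # pool attended tokens into one fused vector
print(fused.shape)             # torch.Size([2, 128])
```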
arXiv Detail & Related papers (2023-04-13T09:50:32Z)
- Unified Automatic Control of Vehicular Systems with Reinforcement Learning [64.63619662693068]
This article contributes a streamlined methodology for vehicular microsimulation.
It discovers high-performance control strategies with minimal manual design.
The study reveals numerous emergent behaviors resembling wave mitigation, traffic signaling, and ramp metering.
arXiv Detail & Related papers (2022-07-30T16:23:45Z)
- Multi-modal Sensor Fusion for Auto Driving Perception: A Survey [22.734013343067407]
We provide a literature review of the existing multi-modal-based methods for perception tasks in autonomous driving.
We propose a taxonomy that divides these methods into two major classes and four minor classes from the viewpoint of the fusion stage.
arXiv Detail & Related papers (2022-02-06T04:18:45Z)
- Divide-and-Conquer for Lane-Aware Diverse Trajectory Prediction [71.97877759413272]
Trajectory prediction is a safety-critical tool for autonomous vehicles to plan and execute actions.
Recent methods have achieved strong performance using Multi-Choice Learning objectives like winner-takes-all (WTA) or best-of-many.
Our work addresses two key challenges in trajectory prediction: learning diverse outputs, and improving predictions by imposing constraints derived from driving knowledge.
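To make the winner-takes-all (WTA) objective mentioned above concrete, a minimal sketch: with K trajectory hypotheses, only the hypothesis closest to the ground truth is penalised. The shapes and the squared-error metric are illustrative assumptions.

```python
import torch

def wta_loss(preds, gt):
    """Winner-takes-all over K hypotheses.
    preds: [K, T, 2] predicted trajectories; gt: [T, 2] ground truth."""
    errors = ((preds - gt) ** 2).sum(dim=-1).mean(dim=-1)  # per-hypothesis error [K]
    return errors.min()  # only the best hypothesis receives gradient

loss = wta_loss(torch.randn(6, 12, 2), torch.randn(12, 2))
print(loss)
```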
arXiv Detail & Related papers (2021-04-16T17:58:56Z)
- Multimodal Object Detection via Bayesian Fusion [59.31437166291557]
We study multimodal object detection with RGB and thermal cameras, since the latter can provide much stronger object signatures under poor illumination.
Our key contribution is a non-learned late-fusion method that fuses together bounding box detections from different modalities.
We apply our approach to benchmarks containing both aligned (KAIST) and unaligned (FLIR) multimodal sensor data.
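The non-learned late fusion above can be illustrated with independence-based Bayesian score fusion for a pair of overlapping detections: p(y | x_rgb, x_thermal) ∝ p(y | x_rgb) p(y | x_thermal) / p(y). The sketch below covers only the class-score part; box matching and box fusion are omitted, and the uniform prior is an assumption.

```python
import numpy as np

def bayesian_score_fusion(p_a, p_b, prior):
    """Fuse two per-class posteriors assuming conditional independence:
    p(y | x_a, x_b) is proportional to p(y | x_a) * p(y | x_b) / p(y)."""
    fused = p_a * p_b / prior
    return fused / fused.sum()

p_rgb     = np.array([0.70, 0.20, 0.10])  # RGB detector class posterior
p_thermal = np.array([0.50, 0.40, 0.10])  # thermal detector class posterior
prior     = np.ones(3) / 3                # uniform class prior (assumed)
print(bayesian_score_fusion(p_rgb, p_thermal, prior))
```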
arXiv Detail & Related papers (2021-04-07T04:03:20Z)
- Studying Person-Specific Pointing and Gaze Behavior for Multimodal Referencing of Outside Objects from a Moving Vehicle [58.720142291102135]
Hand pointing and eye gaze have been extensively investigated in automotive applications for object selection and referencing.
Existing outside-the-vehicle referencing methods focus on a static situation, whereas the situation in a moving vehicle is highly dynamic and subject to safety-critical constraints.
We investigate the specific characteristics of each modality and the interaction between them when used in the task of referencing outside objects.
arXiv Detail & Related papers (2020-09-23T14:56:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.