Causal Mode Multiplexer: A Novel Framework for Unbiased Multispectral Pedestrian Detection
- URL: http://arxiv.org/abs/2403.01300v2
- Date: Fri, 5 Apr 2024 08:42:02 GMT
- Title: Causal Mode Multiplexer: A Novel Framework for Unbiased Multispectral Pedestrian Detection
- Authors: Taeheon Kim, Sebin Shin, Youngjoon Yu, Hak Gu Kim, Yong Man Ro
- Abstract summary: Multispectral pedestrian detectors show poor generalization ability on examples beyond statistical correlation.
We propose a novel Causal Mode Multiplexer framework that effectively learns the causalities between multispectral inputs and predictions.
We construct a new dataset (ROTX-MP) to evaluate modality bias in multispectral pedestrian detection.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: RGBT multispectral pedestrian detection has emerged as a promising solution for safety-critical applications that require day/night operations. However, the modality bias problem remains unsolved as multispectral pedestrian detectors learn the statistical bias in datasets. Specifically, datasets in multispectral pedestrian detection mainly distribute between ROTO (day) and RXTO (night) data; the majority of the pedestrian labels statistically co-occur with their thermal features. As a result, multispectral pedestrian detectors show poor generalization ability on examples beyond this statistical correlation, such as ROTX data. To address this problem, we propose a novel Causal Mode Multiplexer (CMM) framework that effectively learns the causalities between multispectral inputs and predictions. Moreover, we construct a new dataset (ROTX-MP) to evaluate modality bias in multispectral pedestrian detection. ROTX-MP mainly includes ROTX examples not presented in previous datasets. Extensive experiments demonstrate that our proposed CMM framework generalizes well on existing datasets (KAIST, CVC-14, FLIR) and the new ROTX-MP. We will release our new dataset to the public for future research.
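The abstract says the CMM framework "learns the causalities between multispectral inputs and predictions" but does not spell out the debiasing arithmetic. A common counterfactual formulation (a minimal sketch under that assumption, not necessarily the exact CMM objective) removes the biased branch's natural direct effect from the total effect of the fused prediction:

```python
import numpy as np

def total_indirect_effect(logits_full, logits_thermal_only):
    """Debiased prediction as a counterfactual difference: subtract the
    natural direct effect of the biased (thermal-only) branch from the
    total effect of the fused RGB+thermal prediction. This is a generic
    counterfactual-debiasing sketch, not the paper's exact formula."""
    return logits_full - logits_thermal_only

# Toy example: fused logits lean heavily on thermal cues (as on
# ROTO/RXTO-dominated training data); removing the thermal-only
# direct effect reduces that reliance.
fused = np.array([2.0, 1.0])         # pedestrian vs. background logits
thermal_only = np.array([1.5, 0.2])  # bias captured by the thermal branch
debiased = total_indirect_effect(fused, thermal_only)
```

On ROTX examples, where thermal features are uninformative, such a subtraction keeps the RGB-driven evidence while discounting the spurious thermal correlation.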
Related papers
- When Pedestrian Detection Meets Multi-Modal Learning: Generalist Model and Benchmark Dataset [40.24765100535353]
This paper introduces MMPedestron, a novel generalist model for multimodal perception.
The proposed approach comprises a unified encoder for modal representation and fusion, and a general head for pedestrian detection.
With multi-modal joint training, our model achieves state-of-the-art performance on a wide range of pedestrian detection benchmarks.
arXiv Detail & Related papers (2024-07-14T09:16:49Z) - MSCoTDet: Language-driven Multi-modal Fusion for Improved Multispectral Pedestrian Detection [44.35734602609513]
We investigate how to mitigate modality bias in multispectral pedestrian detection using Large Language Models.
We propose a novel Multispectral Chain-of-Thought Detection (MSCoTDet) framework that integrates MSCoT prompting into multispectral pedestrian detection.
arXiv Detail & Related papers (2024-03-22T13:50:27Z) - Source-Free Collaborative Domain Adaptation via Multi-Perspective Feature Enrichment for Functional MRI Analysis [55.03872260158717]
Resting-state functional MRI (rs-fMRI) is increasingly employed in multi-site research to aid neurological disorder analysis.
Many methods have been proposed to reduce fMRI heterogeneity between source and target domains.
However, acquiring source data is challenging due to privacy concerns and/or data storage burdens in multi-site studies.
We design a source-free collaborative domain adaptation framework for fMRI analysis, where only a pretrained source model and unlabeled target data are accessible.
arXiv Detail & Related papers (2023-08-24T01:30:18Z) - MRCLens: an MRC Dataset Bias Detection Toolkit [82.44296974850639]
We introduce MRCLens, a toolkit that detects whether biases exist before users train the full model.
For the convenience of introducing the toolkit, we also provide a categorization of common biases in MRC.
arXiv Detail & Related papers (2022-07-18T21:05:39Z) - Inertial Hallucinations -- When Wearable Inertial Devices Start Seeing Things [82.15959827765325]
We propose a novel approach to multimodal sensor fusion for Ambient Assisted Living (AAL)
We address two major shortcomings of standard multimodal approaches, limited area coverage and reduced reliability.
Our new framework fuses the concept of modality hallucination with triplet learning to train a model with different modalities to handle missing sensors at inference time.
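The summary above combines modality hallucination with triplet learning but gives no loss definition. A standard triplet margin loss (a sketch of the general technique; the paper's exact embedding networks and margin are not given here) pulls a hallucinated-modality embedding toward the real-modality embedding of the same sample and pushes it away from a different sample:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss on embedding vectors: anchor is the
    hallucinated-modality embedding, positive the real-modality embedding
    of the same event, negative an embedding of a different event."""
    d_pos = np.linalg.norm(anchor - positive)  # distance to the match
    d_neg = np.linalg.norm(anchor - negative)  # distance to the non-match
    return max(0.0, d_pos - d_neg + margin)

# Toy 2-D embeddings: a well-separated triplet incurs zero loss.
loss = triplet_loss(np.array([0.0, 0.0]),
                    np.array([0.5, 0.0]),
                    np.array([2.0, 0.0]))
```

Training with such triplets is what lets the model substitute a hallucinated embedding when a sensor is missing at inference time.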
arXiv Detail & Related papers (2022-07-14T10:04:18Z) - Multimedia Datasets for Anomaly Detection: A Survey [0.0]
This paper presents a comprehensive survey on a variety of video, audio, as well as audio-visual datasets based on anomaly detection.
It aims to address the lack of a comprehensive comparison and analysis of multimedia public datasets based on anomaly detection.
arXiv Detail & Related papers (2021-12-10T09:32:21Z) - Unsupervised Deep Anomaly Detection for Multi-Sensor Time-Series Signals [10.866594993485226]
We propose a novel deep learning-based anomaly detection algorithm called Deep Convolutional Autoencoding Memory network (CAE-M)
We first build a Deep Convolutional Autoencoder to characterize spatial dependence of multi-sensor data with a Maximum Mean Discrepancy (MMD)
Then, we construct a Memory Network consisting of linear (Autoregressive Model) and non-linear predictions (Bidirectional LSTM with Attention) to capture temporal dependence from time-series data.
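The CAE-M summary mentions a Maximum Mean Discrepancy term without defining it. A minimal MMD estimator with an RBF kernel (the kernel choice and bandwidth here are assumptions, not taken from the paper) compares two sets of latent codes:

```python
import numpy as np

def mmd_rbf(x, y, gamma=1.0):
    """Biased squared Maximum Mean Discrepancy between samples x and y
    (rows are samples) under an RBF kernel k(a, b) = exp(-gamma * ||a-b||^2).
    Zero when the two samples coincide; grows as the distributions diverge."""
    def k(a, b):
        diff = a[:, None, :] - b[None, :, :]          # pairwise differences
        return np.exp(-gamma * np.sum(diff ** 2, axis=-1))
    return k(x, x).mean() + k(y, y).mean() - 2.0 * k(x, y).mean()

# Toy usage: latent codes vs. a shifted copy of themselves.
rng = np.random.default_rng(0)
codes = rng.normal(size=(8, 3))
gap = mmd_rbf(codes, codes + 3.0)
```

In an autoencoder, such a penalty nudges the latent distribution toward a target distribution, which is how the abstract's MMD term regularizes the learned representation.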
arXiv Detail & Related papers (2021-07-27T06:48:20Z) - TadGAN: Time Series Anomaly Detection Using Generative Adversarial Networks [73.01104041298031]
TadGAN is an unsupervised anomaly detection approach built on Generative Adversarial Networks (GANs)
To capture the temporal correlations of time series, we use LSTM Recurrent Neural Networks as base models for Generators and Critics.
To demonstrate the performance and generalizability of our approach, we test several anomaly scoring techniques and report the best-suited one.
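The TadGAN summary says several anomaly scoring techniques were compared but does not state one. A typical variant combines reconstruction error with the critic's output (the convex weighting and z-score normalization below are assumptions for illustration, not necessarily the paper's best-performing choice):

```python
import numpy as np

def anomaly_score(x, x_rec, critic_out, alpha=0.5):
    """Combine point-wise reconstruction error with a critic signal.
    Low critic output is treated as evidence of an anomaly, so it is
    negated; both signals are z-score normalized before mixing."""
    rec_err = np.abs(x - x_rec)     # large where the GAN fails to reconstruct
    crit = -critic_out              # low critic score => likely anomalous

    def z(v):
        return (v - v.mean()) / (v.std() + 1e-8)

    return alpha * z(rec_err) + (1.0 - alpha) * z(crit)

# Toy usage: one poorly reconstructed point dominates the score.
x = np.zeros(6)
x_rec = x.copy()
x_rec[3] = 4.0                      # reconstruction misses index 3
scores = anomaly_score(x, x_rec, np.zeros(6))
```

Thresholding such a combined score over sliding windows is the usual final step for flagging anomalous time-series segments.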
arXiv Detail & Related papers (2020-09-16T15:52:04Z) - Stance Detection Benchmark: How Robust Is Your Stance Detection? [65.91772010586605]
Stance Detection (StD) aims to detect an author's stance towards a certain topic or claim.
We introduce a StD benchmark that learns from ten StD datasets of various domains in a multi-dataset learning setting.
Within this benchmark setup, we are able to present new state-of-the-art results on five of the datasets.
arXiv Detail & Related papers (2020-01-06T13:37:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.