Related papers: MOS: Model Synergy for Test-Time Adaptation on LiDAR-Based 3D Object Detection

MOS: Model Synergy for Test-Time Adaptation on LiDAR-Based 3D Object Detection

URL: http://arxiv.org/abs/2406.14878v3
Date: Sun, 09 Feb 2025 10:54:30 GMT
Title: MOS: Model Synergy for Test-Time Adaptation on LiDAR-Based 3D Object Detection
Authors: Zhuoxiao Chen, Junjie Meng, Mahsa Baktashmotlagh, Yonggang Zhang, Zi Huang, Yadan Luo,
Abstract summary: We propose a novel online test-time adaptation framework for 3D detectors.<n>By leveraging long-term knowledge from previous test batches, our approach mitigates catastrophic forgetting and adapts effectively to diverse shifts.<n>Our method was rigorously tested against existing test-time adaptation strategies across three datasets and eight types of corruptions.
Score: 38.6421466851974
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: LiDAR-based 3D object detection is crucial for various applications but often experiences performance degradation in real-world deployments due to domain shifts. While most studies focus on cross-dataset shifts, such as changes in environments and object geometries, practical corruptions from sensor variations and weather conditions remain underexplored. In this work, we propose a novel online test-time adaptation framework for 3D detectors that effectively tackles these shifts, including a challenging cross-corruption scenario where cross-dataset shifts and corruptions co-occur. By leveraging long-term knowledge from previous test batches, our approach mitigates catastrophic forgetting and adapts effectively to diverse shifts. Specifically, we propose a Model Synergy (MOS) strategy that dynamically selects historical checkpoints with diverse knowledge and assembles them to best accommodate the current test batch. This assembly is directed by our proposed Synergy Weights (SW), which perform a weighted averaging of the selected checkpoints, minimizing redundancy in the composite model. The SWs are computed by evaluating the similarity of predicted bounding boxes on the test data and the independence of features between checkpoint pairs in the model bank. To maintain an efficient and informative model bank, we discard checkpoints with the lowest average SW scores, replacing them with newly updated models. Our method was rigorously tested against existing test-time adaptation strategies across three datasets and eight types of corruptions, demonstrating superior adaptability to dynamic scenes and conditions. Notably, it achieved a 67.3% improvement in a challenging cross-corruption scenario, offering a more comprehensive benchmark for adaptation. Source code: https://github.com/zhuoxiao-chen/MOS.

Related papers

CodeMerge: Codebook-Guided Model Merging for Robust Test-Time Adaptation in Autonomous Driving [28.022501313260648]
Existing test-time adaptation methods often fail in high-variance tasks like 3D object detection due to unstable optimization and sharp minima.<n>We introduce CodeMerge, a scalable model merging framework that bypasses these limitations by operating in a compact latent space.<n>Our method achieves strong performance across challenging benchmarks, improving end-to-end 3D detection 14.9% NDS on nuScenes-C and LiDAR-based detection by over 7.6% mAP.
arXiv Detail & Related papers (2025-05-22T11:09:15Z)
APCoTTA: Continual Test-Time Adaptation for Semantic Segmentation of Airborne LiDAR Point Clouds [14.348191795901101]
Airborne laser scanning (ALS) point cloud segmentation is a fundamental task for large-scale 3D scene understanding.<n> Continuous Test-Time Adaptation (CTTA) offers a solution by adapting a source-pretrained model to evolving, unlabeled target domains.<n>We propose APCoTTA, the first CTTA method tailored for ALS point cloud semantic segmentation.
arXiv Detail & Related papers (2025-05-15T05:21:16Z)
SKADA-Bench: Benchmarking Unsupervised Domain Adaptation Methods with Realistic Validation [55.87169702896249]
Unsupervised Domain Adaptation (DA) consists of adapting a model trained on a labeled source domain to perform well on an unlabeled target domain with some data distribution shift. We propose a framework to evaluate DA methods and present a fair evaluation of existing shallow algorithms, including reweighting, mapping, and subspace alignment. Our benchmark highlights the importance of realistic validation and provides practical guidance for real-life applications.
arXiv Detail & Related papers (2024-07-16T12:52:29Z)
Approaching Outside: Scaling Unsupervised 3D Object Detection from 2D Scene [22.297964850282177]
We propose LiDAR-2D Self-paced Learning (LiSe) for unsupervised 3D detection. RGB images serve as a valuable complement to LiDAR data, offering precise 2D localization cues. Our framework devises a self-paced learning pipeline that incorporates adaptive sampling and weak model aggregation strategies.
arXiv Detail & Related papers (2024-07-11T14:58:49Z)
Find n' Propagate: Open-Vocabulary 3D Object Detection in Urban Environments [67.83787474506073]
We tackle the limitations of current LiDAR-based 3D object detection systems. We introduce a universal textscFind n' Propagate approach for 3D OV tasks. We achieve up to a 3.97-fold increase in Average Precision (AP) for novel object classes.
arXiv Detail & Related papers (2024-03-20T12:51:30Z)
What, How, and When Should Object Detectors Update in Continually Changing Test Domains? [34.13756022890991]
Test-time adaptation algorithms have been proposed to adapt the model online while inferring test data. We propose a novel online adaption approach for object detection in continually changing test domains. Our approach surpasses baselines on widely used benchmarks, achieving improvements of up to 4.9%p and 7.9%p in mAP.
arXiv Detail & Related papers (2023-12-12T07:13:08Z)
Exchanging Dual Encoder-Decoder: A New Strategy for Change Detection with Semantic Guidance and Spatial Localization [10.059696915598392]
We propose a new strategy with an exchanging dual encoder-decoder structure for binary change detection with semantic guidance and spatial localization. We build a binary change detection model based on this strategy, and then validate and compare it with 18 state-of-the-art change detection methods on six datasets.
arXiv Detail & Related papers (2023-11-19T11:30:43Z)
Revisiting Domain-Adaptive 3D Object Detection by Reliable, Diverse and Class-balanced Pseudo-Labeling [38.07637524378327]
Unsupervised domain adaptation (DA) with the aid of pseudo labeling techniques has emerged as a crucial approach for domain-adaptive 3D object detection. Existing DA methods suffer from a substantial drop in performance when applied to a multi-class training setting. We propose a novel ReDB framework tailored for learning to detect all classes at once.
arXiv Detail & Related papers (2023-07-16T04:34:11Z)
On Pitfalls of Test-Time Adaptation [82.8392232222119]
Test-Time Adaptation (TTA) has emerged as a promising approach for tackling the robustness challenge under distribution shifts. We present TTAB, a test-time adaptation benchmark that encompasses ten state-of-the-art algorithms, a diverse array of distribution shifts, and two evaluation protocols.
arXiv Detail & Related papers (2023-06-06T09:35:29Z)
Efficient Test-Time Model Adaptation without Forgetting [60.36499845014649]
Test-time adaptation seeks to tackle potential distribution shifts between training and testing data. We propose an active sample selection criterion to identify reliable and non-redundant samples. We also introduce a Fisher regularizer to constrain important model parameters from drastic changes.
arXiv Detail & Related papers (2022-04-06T06:39:40Z)
Attentive Prototypes for Source-free Unsupervised Domain Adaptive 3D Object Detection [85.11649974840758]
3D object detection networks tend to be biased towards the data they are trained on. We propose a single-frame approach for source-free, unsupervised domain adaptation of lidar-based 3D object detectors.
arXiv Detail & Related papers (2021-11-30T18:42:42Z)
When Liebig's Barrel Meets Facial Landmark Detection: A Practical Model [87.25037167380522]
We propose a model that is accurate, robust, efficient, generalizable, and end-to-end trainable. In order to achieve a better accuracy, we propose two lightweight modules. DQInit dynamically initializes the queries of decoder from the inputs, enabling the model to achieve as good accuracy as the ones with multiple decoder layers. QAMem is designed to enhance the discriminative ability of queries on low-resolution feature maps by assigning separate memory values to each query rather than a shared one.
arXiv Detail & Related papers (2021-05-27T13:51:42Z)

This list is automatically generated from the titles and abstracts of the papers in this site.