Semantic Library Adaptation: LoRA Retrieval and Fusion for Open-Vocabulary Semantic Segmentation
- URL: http://arxiv.org/abs/2503.21780v1
- Date: Thu, 27 Mar 2025 17:59:58 GMT
- Title: Semantic Library Adaptation: LoRA Retrieval and Fusion for Open-Vocabulary Semantic Segmentation
- Authors: Reza Qorbani, Gianluca Villani, Theodoros Panagiotakopoulos, Marc Botet Colomer, Linus Härenstam-Nielsen, Mattia Segu, Pier Luigi Dovesi, Jussi Karlgren, Daniel Cremers, Federico Tombari, Matteo Poggi,
- Abstract summary: Open-vocabulary semantic segmentation models associate vision and text to label pixels from an undefined set of classes using textual queries.<n>We introduce Semantic Library Adaptation (SemLA), a novel framework for training-free, test-time domain adaptation.
- Score: 72.28364940168092
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Open-vocabulary semantic segmentation models associate vision and text to label pixels from an undefined set of classes using textual queries, providing versatile performance on novel datasets. However, large shifts between training and test domains degrade their performance, requiring fine-tuning for effective real-world applications. We introduce Semantic Library Adaptation (SemLA), a novel framework for training-free, test-time domain adaptation. SemLA leverages a library of LoRA-based adapters indexed with CLIP embeddings, dynamically merging the most relevant adapters based on proximity to the target domain in the embedding space. This approach constructs an ad-hoc model tailored to each specific input without additional training. Our method scales efficiently, enhances explainability by tracking adapter contributions, and inherently protects data privacy, making it ideal for sensitive applications. Comprehensive experiments on a 20-domain benchmark built over 10 standard datasets demonstrate SemLA's superior adaptability and performance across diverse settings, establishing a new standard in domain adaptation for open-vocabulary semantic segmentation.
Related papers
- VLMs meet UDA: Boosting Transferability of Open Vocabulary Segmentation with Unsupervised Domain Adaptation [3.776249047528669]
This paper proposes enhancing segmentation accuracy across diverse domains by integrating Vision-Language reasoning with key strategies for Unsupervised Domain Adaptation (UDA)<n>We improve the fine-grained segmentation capabilities of VLMs through multi-scale contextual data, robust text embeddings with prompt augmentation, and layer-wise fine-tuning in our proposed Foundational-Retaining Open Vocabulary (FROVSS) framework.<n>The resulting UDA-FROV framework is the first UDA approach to effectively adapt across domains without requiring shared categories.
arXiv Detail & Related papers (2024-12-12T12:49:42Z) - Text-Video Retrieval with Global-Local Semantic Consistent Learning [122.15339128463715]
We propose a simple yet effective method, Global-Local Semantic Consistent Learning (GLSCL)
GLSCL capitalizes on latent shared semantics across modalities for text-video retrieval.
Our method achieves comparable performance with SOTA as well as being nearly 220 times faster in terms of computational cost.
arXiv Detail & Related papers (2024-05-21T11:59:36Z) - CLIPArTT: Adaptation of CLIP to New Domains at Test Time [19.0284321951354]
We introduce CLIP Adaptation duRing Test-Time (CLIPArTT), a fully test-time adaptation (TTA) approach for pre-trained vision-language models (VLMs)<n>Our method employs a unique, minimally invasive text prompt tuning process, wherein multiple predicted classes are aggregated into a single new text prompt, used as emphpseudo label to re-classify inputs.<n>Our findings demonstrate that, without requiring additional transformations nor new trainable modules, CLIPArTT enhances performance dynamically across non-corrupted datasets.
arXiv Detail & Related papers (2024-05-01T07:24:30Z) - Unified Language-driven Zero-shot Domain Adaptation [55.64088594551629]
Unified Language-driven Zero-shot Domain Adaptation (ULDA) is a novel task setting.
It enables a single model to adapt to diverse target domains without explicit domain-ID knowledge.
arXiv Detail & Related papers (2024-04-10T16:44:11Z) - Divide and Adapt: Active Domain Adaptation via Customized Learning [56.79144758380419]
We present Divide-and-Adapt (DiaNA), a new ADA framework that partitions the target instances into four categories with stratified transferable properties.
With a novel data subdivision protocol based on uncertainty and domainness, DiaNA can accurately recognize the most gainful samples.
Thanks to the "divideand-adapt" spirit, DiaNA can handle data with large variations of domain gap.
arXiv Detail & Related papers (2023-07-21T14:37:17Z) - AdapterSoup: Weight Averaging to Improve Generalization of Pretrained
Language Models [127.04370753583261]
Pretrained language models (PLMs) are trained on massive corpora, but often need to specialize to specific domains.
A solution is to use a related-domain adapter for the novel domain at test time.
We introduce AdapterSoup, an approach that performs weight-space averaging of adapters trained on different domains.
arXiv Detail & Related papers (2023-02-14T13:09:23Z) - Unsupervised Contrastive Domain Adaptation for Semantic Segmentation [75.37470873764855]
We introduce contrastive learning for feature alignment in cross-domain adaptation.
The proposed approach consistently outperforms state-of-the-art methods for domain adaptation.
It achieves 60.2% mIoU on the Cityscapes dataset.
arXiv Detail & Related papers (2022-04-18T16:50:46Z) - Adapting Segmentation Networks to New Domains by Disentangling Latent
Representations [14.050836886292869]
Domain adaptation approaches have come into play to transfer knowledge acquired on a label-abundant source domain to a related label-scarce target domain.
We propose a novel performance metric to capture the relative efficacy of an adaptation strategy compared to supervised training.
arXiv Detail & Related papers (2021-08-06T09:43:07Z) - Semi-Supervised Domain Adaptation via Adaptive and Progressive Feature
Alignment [32.77436219094282]
SSDAS employs a few labeled target samples as anchors for adaptive and progressive feature alignment between labeled source samples and unlabeled target samples.
In addition, we replace the dissimilar source features by high-confidence target features continuously during the iterative training process.
Extensive experiments show the proposed SSDAS greatly outperforms a number of baselines.
arXiv Detail & Related papers (2021-06-05T09:12:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.