Rein++: Efficient Generalization and Adaptation for Semantic Segmentation with Vision Foundation Models
- URL: http://arxiv.org/abs/2508.01667v1
- Date: Sun, 03 Aug 2025 08:53:30 GMT
- Title: Rein++: Efficient Generalization and Adaptation for Semantic Segmentation with Vision Foundation Models
- Authors: Zhixiang Wei, Xiaoxiao Ma, Ruishen Yan, Tao Tu, Huaian Chen, Jinjin Zheng, Yi Jin, Enhong Chen,
- Abstract summary: Rein++ is an efficient VFM-based segmentation framework.<n>It demonstrates superior generalization from limited data.<n>It enables effective adaptation to diverse unlabeled scenarios.
- Score: 47.66611300605174
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Vision Foundation Models(VFMs) have achieved remarkable success in various computer vision tasks. However, their application to semantic segmentation is hindered by two significant challenges: (1) the disparity in data scale, as segmentation datasets are typically much smaller than those used for VFM pre-training, and (2) domain distribution shifts, where real-world segmentation scenarios are diverse and often underrepresented during pre-training. To overcome these limitations, we present Rein++, an efficient VFM-based segmentation framework that demonstrates superior generalization from limited data and enables effective adaptation to diverse unlabeled scenarios. Specifically, Rein++ comprises a domain generalization solution Rein-G and a domain adaptation solution Rein-A. Rein-G introduces a set of trainable, instance-aware tokens that effectively refine the VFM's features for the segmentation task. This parameter-efficient approach fine-tunes less than 1% of the backbone's parameters, enabling robust generalization. Building on the Rein-G, Rein-A performs unsupervised domain adaptation at both the instance and logit levels to mitigate domain shifts. In addition, it incorporates a semantic transfer module that leverages the class-agnostic capabilities of the segment anything model to enhance boundary details in the target domain. The integrated Rein++ pipeline first learns a generalizable model on a source domain (e.g., daytime scenes) and subsequently adapts it to diverse target domains (e.g., nighttime scenes) without any target labels. Comprehensive experiments demonstrate that Rein++ significantly outperforms state-of-the-art methods with efficient training, underscoring its roles an efficient, generalizable, and adaptive segmentation solution for VFMs, even for large models with billions of parameters. The code is available at https://github.com/wloves/Rein.
Related papers
- Generalize or Detect? Towards Robust Semantic Segmentation Under Multiple Distribution Shifts [56.57141696245328]
In open-world scenarios, where both novel classes and domains may exist, an ideal segmentation model should detect anomaly classes for safety.
Existing methods often struggle to distinguish between domain-level and semantic-level distribution shifts.
arXiv Detail & Related papers (2024-11-06T11:03:02Z) - Stronger, Fewer, & Superior: Harnessing Vision Foundation Models for Domain Generalized Semantic Segmentation [16.12797115277337]
We first assess and harness various Vision Foundation Models (VFMs) in the context of Domain Generalized Semantic (DGSS)
We introduce a robust fine-tuning approach, namely Rein, to parameter-efficiently harness VFMs for DGSS.
With fewer trainable parameters, Rein efficiently fine-tunes VFMs for DGSS tasks, surprisingly surpassing full parameter fine-tuning.
arXiv Detail & Related papers (2023-12-07T12:43:00Z) - Frequency-mixed Single-source Domain Generalization for Medical Image
Segmentation [29.566769388674473]
The scarcity of medical image segmentation poses challenges in collecting sufficient training data for deep learning models.
We propose a novel approach called the Frequency-mixed Single-source Domain Generalization method (FreeSDG)
Experimental results on five datasets of three modalities demonstrate the effectiveness of the proposed algorithm.
arXiv Detail & Related papers (2023-07-18T06:44:45Z) - HGFormer: Hierarchical Grouping Transformer for Domain Generalized
Semantic Segmentation [113.6560373226501]
This work studies semantic segmentation under the domain generalization setting.
We propose a novel hierarchical grouping transformer (HGFormer) to explicitly group pixels to form part-level masks and then whole-level masks.
Experiments show that HGFormer yields more robust semantic segmentation results than per-pixel classification methods and flat grouping transformers.
arXiv Detail & Related papers (2023-05-22T13:33:41Z) - SALUDA: Surface-based Automotive Lidar Unsupervised Domain Adaptation [62.889835139583965]
We introduce an unsupervised auxiliary task of learning an implicit underlying surface representation simultaneously on source and target data.
As both domains share the same latent representation, the model is forced to accommodate discrepancies between the two sources of data.
Our experiments demonstrate that our method achieves a better performance than the current state of the art, both in real-to-real and synthetic-to-real scenarios.
arXiv Detail & Related papers (2023-04-06T17:36:23Z) - Semi-supervised Meta-learning with Disentanglement for
Domain-generalised Medical Image Segmentation [15.351113774542839]
Generalising models to new data from new centres (termed here domains) remains a challenge.
We propose a novel semi-supervised meta-learning framework with disentanglement.
We show that the proposed method is robust on different segmentation tasks and achieves state-of-the-art generalisation performance on two public benchmarks.
arXiv Detail & Related papers (2021-06-24T19:50:07Z) - RobustNet: Improving Domain Generalization in Urban-Scene Segmentation
via Instance Selective Whitening [40.98892593362837]
Enhancing generalization capability of deep neural networks to unseen domains is crucial for safety-critical applications in the real world such as autonomous driving.
This paper proposes a novel instance selective whitening loss to improve the robustness of the segmentation networks for unseen domains.
arXiv Detail & Related papers (2021-03-29T13:19:37Z) - Towards Adaptive Semantic Segmentation by Progressive Feature Refinement [16.40758125170239]
We propose an innovative progressive feature refinement framework, along with domain adversarial learning to boost the transferability of segmentation networks.
As a result, the segmentation models trained with source domain images can be transferred to a target domain without significant performance degradation.
arXiv Detail & Related papers (2020-09-30T04:17:48Z) - Alleviating Semantic-level Shift: A Semi-supervised Domain Adaptation
Method for Semantic Segmentation [97.8552697905657]
A key challenge of this task is how to alleviate the data distribution discrepancy between the source and target domains.
We propose Alleviating Semantic-level Shift (ASS), which can successfully promote the distribution consistency from both global and local views.
We apply our ASS to two domain adaptation tasks, from GTA5 to Cityscapes and from Synthia to Cityscapes.
arXiv Detail & Related papers (2020-04-02T03:25:05Z) - Supervised Domain Adaptation using Graph Embedding [86.3361797111839]
Domain adaptation methods assume that distributions between the two domains are shifted and attempt to realign them.
We propose a generic framework based on graph embedding.
We show that the proposed approach leads to a powerful Domain Adaptation framework.
arXiv Detail & Related papers (2020-03-09T12:25:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.