MEDOE: A Multi-Expert Decoder and Output Ensemble Framework for
Long-tailed Semantic Segmentation
- URL: http://arxiv.org/abs/2308.08213v1
- Date: Wed, 16 Aug 2023 08:30:44 GMT
- Title: MEDOE: A Multi-Expert Decoder and Output Ensemble Framework for
Long-tailed Semantic Segmentation
- Authors: Junao Shen, Long Chen, Kun Kuang, Fei Wu, Tian Feng, Wei Zhang
- Abstract summary: Long-tailed distribution of semantic categories causes unsatisfactory performance in semantic segmentation on tail categories.
We propose MEDOE, a novel framework for long-tailed semantic segmentation via contextual information ensemble-and-grouping.
Experimental results show that the proposed framework outperforms the current methods on both Cityscapes and ADE20K datasets.
- Score: 36.03023287593103
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Long-tailed distribution of semantic categories, which has been often ignored
in conventional methods, causes unsatisfactory performance in semantic
segmentation on tail categories. In this paper, we focus on the problem of
long-tailed semantic segmentation. Although some long-tailed recognition
methods (e.g., re-sampling/re-weighting) have been proposed in other problems,
they can probably compromise crucial contextual information and are thus hardly
adaptable to the problem of long-tailed semantic segmentation. To address this
issue, we propose MEDOE, a novel framework for long-tailed semantic
segmentation via contextual information ensemble-and-grouping. The proposed
two-stage framework comprises a multi-expert decoder (MED) and a multi-expert
output ensemble (MOE). Specifically, the MED includes several "experts". Based
on the pixel frequency distribution, each expert takes the dataset masked
according to the specific categories as input and generates contextual
information self-adaptively for classification; The MOE adopts learnable
decision weights for the ensemble of the experts' outputs. As a model-agnostic
framework, our MEDOE can be flexibly and efficiently coupled with various
popular deep neural networks (e.g., DeepLabv3+, OCRNet, and PSPNet) to improve
their performance in long-tailed semantic segmentation. Experimental results
show that the proposed framework outperforms the current methods on both
Cityscapes and ADE20K datasets by up to 1.78% in mIoU and 5.89% in mAcc.
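The ensemble-and-grouping idea in the abstract can be sketched in a few lines of PyTorch. This is an illustrative reconstruction, not the authors' implementation: the module name `MultiExpertEnsemble`, the per-expert 1x1-conv heads, and the scalar decision weights are assumptions standing in for the MED experts and the MOE's learnable fusion.

```python
import torch
import torch.nn as nn


class MultiExpertEnsemble(nn.Module):
    """Sketch of MED + MOE: each expert decodes logits for its own
    frequency-based category group; learnable decision weights fuse
    the experts' outputs into a single prediction."""

    def __init__(self, in_channels: int, num_classes: int, num_experts: int):
        super().__init__()
        # One lightweight decoder head per expert (stand-in for the MED experts,
        # which in the paper consume category-masked inputs).
        self.experts = nn.ModuleList(
            nn.Conv2d(in_channels, num_classes, kernel_size=1)
            for _ in range(num_experts)
        )
        # Learnable per-expert decision weights (the MOE step),
        # softmax-normalized at fusion time.
        self.decision_weights = nn.Parameter(torch.zeros(num_experts))

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # Stack expert logits: (num_experts, B, C, H, W).
        logits = torch.stack([expert(features) for expert in self.experts])
        w = torch.softmax(self.decision_weights, dim=0)
        # Weighted sum over the expert axis -> fused logits (B, C, H, W).
        return torch.einsum("e,ebchw->bchw", w, logits)


# Example: 3 experts (e.g., head/body/tail pixel-frequency groups)
# over the 19 Cityscapes classes.
model = MultiExpertEnsemble(in_channels=256, num_classes=19, num_experts=3)
fused = model(torch.randn(2, 256, 64, 128))
print(fused.shape)  # torch.Size([2, 19, 64, 128])
```

In the paper the experts are grouped by pixel frequency and fed masked versions of the dataset; here a shared feature map is used purely to keep the sketch self-contained.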
Related papers
- Segment and Matte Anything in a Unified Model [5.8874968768571625]
Segment Anything (SAM) has recently pushed the boundaries of segmentation by demonstrating zero-shot generalization and flexible prompting.
We introduce Segment And Matte Anything (SAMA), a lightweight extension of SAM that delivers high-quality interactive image segmentation and matting.
arXiv Detail & Related papers (2026-01-17T19:43:10Z)
- Generalized Fine-Grained Category Discovery with Multi-Granularity Conceptual Experts [81.68203255687051]
Generalized Category Discovery is an open-world problem that clusters unlabeled data by leveraging knowledge from partially labeled categories.
Existing approaches fail to exploit multi-granularity conceptual information in visual data.
We propose a Multi-Granularity Experts framework that integrates multi-granularity knowledge for accurate category discovery.
arXiv Detail & Related papers (2025-09-30T13:25:11Z)
- Semi-MoE: Mixture-of-Experts meets Semi-Supervised Histopathology Segmentation [13.530424405137417]
Semi-supervised learning has been employed to alleviate the need for extensive labeled data for histopathology image segmentation.
Existing methods struggle with noisy pseudo-labels due to ambiguous gland boundaries and morphological misclassification.
This paper introduces Semi-MOE, to the best of our knowledge, the first multi-task Mixture-of-Experts framework for semi-supervised histopathology image segmentation.
arXiv Detail & Related papers (2025-09-17T09:03:04Z)
- DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model [10.4137698020509]
Variability remains a substantial challenge in medical image segmentation, stemming from ambiguous imaging boundaries and diverse clinical expertise.
We propose DiffOSeg, a two-stage diffusion-based framework, which aims to simultaneously achieve both consensus-driven and preference-driven segmentation.
Our model outperforms existing state-of-the-art methods across all evaluated metrics.
arXiv Detail & Related papers (2025-07-17T12:57:27Z)
- Cross-Domain Semantic Segmentation with Large Language Model-Assisted Descriptor Generation [0.0]
LangSeg is a novel semantic segmentation method that leverages context-sensitive, fine-grained subclass descriptors.
We evaluate LangSeg on two challenging datasets, ADE20K and COCO-Stuff, where it outperforms state-of-the-art models.
arXiv Detail & Related papers (2025-01-27T20:02:12Z)
- Frequency-based Matcher for Long-tailed Semantic Segmentation [22.199174076366003]
We focus on a relatively under-explored task setting, long-tailed semantic segmentation (LTSS).
We propose a dual-metric evaluation system and construct the LTSS benchmark to demonstrate the performance of semantic segmentation methods and long-tailed solutions.
We also propose a transformer-based algorithm to improve LTSS, frequency-based matcher, which solves the oversuppression problem by one-to-many matching.
arXiv Detail & Related papers (2024-06-06T09:57:56Z)
- Universal Segmentation at Arbitrary Granularity with Language Instruction [59.76130089644841]
We present UniLSeg, a universal segmentation model that can perform segmentation at any semantic level with the guidance of language instructions.
For training UniLSeg, we reorganize a group of tasks from original diverse distributions into a unified data format, where images paired with texts describing segmentation targets serve as input and the corresponding masks are the output.
arXiv Detail & Related papers (2023-12-04T04:47:48Z)
- Inter-Rater Uncertainty Quantification in Medical Image Segmentation via Rater-Specific Bayesian Neural Networks [7.642026462053574]
We introduce a novel Bayesian neural network-based architecture to estimate inter-rater uncertainty in medical image segmentation.
Firstly, we introduce a one-encoder-multi-decoder architecture specifically tailored for uncertainty estimation.
Secondly, we propose Bayesian modeling for the new architecture, allowing efficient capture of the inter-rater distribution.
arXiv Detail & Related papers (2023-06-28T20:52:51Z)
- Object Segmentation by Mining Cross-Modal Semantics [68.88086621181628]
We propose a novel approach by mining the Cross-Modal Semantics to guide the fusion and decoding of multimodal features.
Specifically, we propose a novel network, termed XMSNet, consisting of (1) all-round attentive fusion (AF), (2) coarse-to-fine decoder (CFD), and (3) cross-layer self-supervision.
arXiv Detail & Related papers (2023-05-17T14:30:11Z)
- An Efficient Multi-Scale Fusion Network for 3D Organ at Risk (OAR) Segmentation [2.6770199357488242]
We propose a new OAR segmentation framework called OARFocalFuseNet.
It fuses multi-scale features and employs focal modulation for capturing global-local context across multiple scales.
Our best performing method (OARFocalFuseNet) obtained a Dice coefficient of 0.7995 and a Hausdorff distance of 5.1435 on the OpenKBP dataset.
arXiv Detail & Related papers (2022-08-15T19:40:18Z)
- AF$_2$: Adaptive Focus Framework for Aerial Imagery Segmentation [86.44683367028914]
Aerial imagery segmentation poses unique challenges, the most critical of which is foreground-background imbalance.
We propose the Adaptive Focus Framework (AF$_2$), which adopts a hierarchical segmentation procedure and focuses on adaptively utilizing multi-scale representations.
AF$_2$ has significantly improved accuracy on three widely used aerial benchmarks while remaining as fast as mainstream methods.
arXiv Detail & Related papers (2022-02-18T10:14:45Z)
- CTNet: Context-based Tandem Network for Semantic Segmentation [77.4337867789772]
This work proposes a novel Context-based Tandem Network (CTNet) by interactively exploring the spatial contextual information and the channel contextual information.
To further improve the performance of the learned representations for semantic segmentation, the results of the two context modules are adaptively integrated.
arXiv Detail & Related papers (2021-04-20T07:33:11Z)
- Three Ways to Improve Semantic Segmentation with Self-Supervised Depth Estimation [90.87105131054419]
We present a framework for semi-supervised semantic segmentation, which is enhanced by self-supervised monocular depth estimation from unlabeled image sequences.
We validate the proposed model on the Cityscapes dataset, where all three modules demonstrate significant performance gains.
arXiv Detail & Related papers (2020-12-19T21:18:03Z)
- Boundary-aware Context Neural Network for Medical Image Segmentation [15.585851505721433]
Medical image segmentation can provide a reliable basis for further clinical analysis and disease diagnosis.
Most existing CNN-based methods produce unsatisfactory segmentation masks without accurate object boundaries.
In this paper, we formulate a boundary-aware context neural network (BA-Net) for 2D medical image segmentation.
arXiv Detail & Related papers (2020-05-03T02:35:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.