Few to Big: Prototype Expansion Network via Diffusion Learner for Point Cloud Few-shot Semantic Segmentation
- URL: http://arxiv.org/abs/2509.12878v1
- Date: Tue, 16 Sep 2025 09:29:46 GMT
- Title: Few to Big: Prototype Expansion Network via Diffusion Learner for Point Cloud Few-shot Semantic Segmentation
- Authors: Qianguang Zhao, Dongli Wang, Yan Zhou, Jianxun Li, Richard Irampa,
- Abstract summary: Prototype Expansion Network (PENet) is a framework that constructs big-capacity prototypes from two annotated feature sources.<n>PENet significantly outperforms state-of-the-art methods across various few-shot settings.
- Score: 12.971351926107289
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Few-shot 3D point cloud semantic segmentation aims to segment novel categories using a minimal number of annotated support samples. While existing prototype-based methods have shown promise, they are constrained by two critical challenges: (1) Intra-class Diversity, where a prototype's limited representational capacity fails to cover a class's full variations, and (2) Inter-set Inconsistency, where prototypes derived from the support set are misaligned with the query feature space. Motivated by the powerful generative capability of diffusion model, we re-purpose its pre-trained conditional encoder to provide a novel source of generalizable features for expanding the prototype's representational range. Under this setup, we introduce the Prototype Expansion Network (PENet), a framework that constructs big-capacity prototypes from two complementary feature sources. PENet employs a dual-stream learner architecture: it retains a conventional fully supervised Intrinsic Learner (IL) to distill representative features, while introducing a novel Diffusion Learner (DL) to provide rich generalizable features. The resulting dual prototypes are then processed by a Prototype Assimilation Module (PAM), which adopts a novel push-pull cross-guidance attention block to iteratively align the prototypes with the query space. Furthermore, a Prototype Calibration Mechanism (PCM) regularizes the final big capacity prototype to prevent semantic drift. Extensive experiments on the S3DIS and ScanNet datasets demonstrate that PENet significantly outperforms state-of-the-art methods across various few-shot settings.
Related papers
- Query-aware Hub Prototype Learning for Few-Shot 3D Point Cloud Semantic Segmentation [15.238614809926936]
Few-shot 3D point cloud semantic segmentation (FS-3DSeg) aims to segment novel classes with only a few labeled samples.<n>Existing metric-based prototype learning methods generate prototypes solely from the support set, without considering their relevance to query data.<n>We propose a novel Query-aware Hub Prototype (QHP) learning method that explicitly models semantic correlations between support and query sets.
arXiv Detail & Related papers (2025-12-09T05:18:30Z) - Divide, Conquer and Unite: Hierarchical Style-Recalibrated Prototype Alignment for Federated Medical Image Segmentation [66.82598255715696]
Federated learning enables multiple medical institutions to train a global model without sharing data.<n>Current approaches primarily focus on final-layer features, overlooking critical multi-level cues.<n>We propose FedBCS to bridge feature representation gaps via domain-invariant contextual prototypes alignment.
arXiv Detail & Related papers (2025-11-14T04:15:34Z) - DPL: Spatial-Conditioned Diffusion Prototype Enhancement for One-Shot Medical Segmentation [0.0]
One-shot medical image segmentation faces fundamental challenges in prototype representation due to limited annotated data and anatomical variability across patients.<n>Traditional prototype-based methods rely on deterministic averaging of support features, creating brittle representations that fail to capture intra-class diversity essential for robust generalization.<n>This work introduces Diffusion Prototype Learning, a novel framework that reformulates prototype construction through diffusion-based feature space exploration.
arXiv Detail & Related papers (2025-10-14T05:28:58Z) - Probabilistic Prototype Calibration of Vision-Language Models for Generalized Few-shot Semantic Segmentation [75.18058114915327]
Generalized Few-Shot Semanticnative (GFSS) aims to extend a segmentation model to novel classes with only a few annotated examples.<n>We propose FewCLIP, a probabilistic prototype calibration framework over multi-modal prototypes from the pretrained CLIP.<n>We show FewCLIP significantly outperforms state-of-the-art approaches across both GFSS and class-incremental setting.
arXiv Detail & Related papers (2025-06-28T18:36:22Z) - Proto-FG3D: Prototype-based Interpretable Fine-Grained 3D Shape Classification [59.68055837500357]
We propose the first prototype-based framework named Proto-FG3D for fine-grained 3D shape classification.<n>Proto-FG3D establishes joint multi-view and multi-category representation learning via Prototype Association.<n>Proto-FG3D surpasses state-of-the-art methods in accuracy, transparent predictions, and ad-hoc interpretability with visualizations.
arXiv Detail & Related papers (2025-05-23T09:31:02Z) - A Deep Positive-Negative Prototype Approach to Integrated Prototypical Discriminative Learning [0.30693357740321775]
This paper proposes a novel Deep Positive-Negative Prototype (DPNP) model that combines prototype-based learning (PbL) with discriminative methods to improve class compactness and separability in deep neural networks.<n>We show that DPNP can organize prototypes in nearly regular positions within feature space, such that it is possible to achieve competitive classification accuracy even in much lower-dimensional feature spaces.
arXiv Detail & Related papers (2025-01-05T08:24:31Z) - Query-guided Prototype Evolution Network for Few-Shot Segmentation [85.75516116674771]
We present a new method that integrates query features into the generation process of foreground and background prototypes.<n> Experimental results on the PASCAL-$5i$ and mirroring-$20i$ datasets attest to the substantial enhancements achieved by QPENet.
arXiv Detail & Related papers (2024-03-11T07:50:40Z) - Multimodal Prototype-Enhanced Network for Few-Shot Action Recognition [40.329190454146996]
MultimOdal PRototype-ENhanced Network (MORN) uses semantic information of label texts as multimodal information to enhance prototypes.
We conduct extensive experiments on four popular few-shot action recognition datasets.
arXiv Detail & Related papers (2022-12-09T14:24:39Z) - Few-Shot Segmentation via Rich Prototype Generation and Recurrent
Prediction Enhancement [12.614578133091168]
We propose a rich prototype generation module (RPGM) and a recurrent prediction enhancement module (RPEM) to reinforce the prototype learning paradigm.
RPGM combines superpixel and K-means clustering to generate rich prototype features with complementary scale relationships.
RPEM utilizes the recurrent mechanism to design a round-way propagation decoder.
arXiv Detail & Related papers (2022-10-03T08:46:52Z) - Dual Prototypical Contrastive Learning for Few-shot Semantic
Segmentation [55.339405417090084]
We propose a dual prototypical contrastive learning approach tailored to the few-shot semantic segmentation (FSS) task.
The main idea is to encourage the prototypes more discriminative by increasing inter-class distance while reducing intra-class distance in prototype feature space.
We demonstrate that the proposed dual contrastive learning approach outperforms state-of-the-art FSS methods on PASCAL-5i and COCO-20i datasets.
arXiv Detail & Related papers (2021-11-09T08:14:50Z) - Prototype Completion for Few-Shot Learning [13.63424509914303]
Few-shot learning aims to recognize novel classes with few examples.
Pre-training based methods effectively tackle the problem by pre-training a feature extractor and then fine-tuning it through the nearest centroid based meta-learning.
We propose a novel prototype completion based meta-learning framework.
arXiv Detail & Related papers (2021-08-11T03:44:00Z) - End-to-end One-shot Human Parsing [91.5113227694443]
One-shot human parsing (OSHP) task requires parsing humans into an open set of classes defined by any test example.
End-to-end One-shot human Parsing Network (EOP-Net) proposed.
EOP-Net outperforms representative one-shot segmentation models by large margins.
arXiv Detail & Related papers (2021-05-04T01:35:50Z) - Part-aware Prototype Network for Few-shot Semantic Segmentation [50.581647306020095]
We propose a novel few-shot semantic segmentation framework based on the prototype representation.
Our key idea is to decompose the holistic class representation into a set of part-aware prototypes.
We develop a novel graph neural network model to generate and enhance the proposed part-aware prototypes.
arXiv Detail & Related papers (2020-07-13T11:03:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.