Related papers: Few to Big: Prototype Expansion Network via Diffusion Learner for Point Cloud Few-shot Semantic Segmentation

URL: http://arxiv.org/abs/2509.12878v1
Date: Tue, 16 Sep 2025 09:29:46 GMT
Title: Few to Big: Prototype Expansion Network via Diffusion Learner for Point Cloud Few-shot Semantic Segmentation
Authors: Qianguang Zhao, Dongli Wang, Yan Zhou, Jianxun Li, Richard Irampa,
Abstract summary: Prototype Expansion Network (PENet) is a framework that constructs big-capacity prototypes from two complementary feature sources. PENet significantly outperforms state-of-the-art methods across various few-shot settings.
Score: 12.971351926107289
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Few-shot 3D point cloud semantic segmentation aims to segment novel categories using a minimal number of annotated support samples. While existing prototype-based methods have shown promise, they are constrained by two critical challenges: (1) Intra-class Diversity, where a prototype's limited representational capacity fails to cover a class's full variations, and (2) Inter-set Inconsistency, where prototypes derived from the support set are misaligned with the query feature space. Motivated by the powerful generative capability of the diffusion model, we repurpose its pre-trained conditional encoder to provide a novel source of generalizable features for expanding the prototype's representational range. Under this setup, we introduce the Prototype Expansion Network (PENet), a framework that constructs big-capacity prototypes from two complementary feature sources. PENet employs a dual-stream learner architecture: it retains a conventional fully supervised Intrinsic Learner (IL) to distill representative features, while introducing a novel Diffusion Learner (DL) to provide rich generalizable features. The resulting dual prototypes are then processed by a Prototype Assimilation Module (PAM), which adopts a novel push-pull cross-guidance attention block to iteratively align the prototypes with the query space. Furthermore, a Prototype Calibration Mechanism (PCM) regularizes the final big-capacity prototype to prevent semantic drift. Extensive experiments on the S3DIS and ScanNet datasets demonstrate that PENet significantly outperforms state-of-the-art methods across various few-shot settings.
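
For readers unfamiliar with the prototype-based baseline the abstract refers to, the following is a minimal sketch of the generic idea only, not of PENet or any of its modules (IL, DL, PAM, PCM). It assumes per-point feature vectors are already available, builds one prototype per class by averaging the labeled support features, and labels each query point by cosine similarity to the nearest prototype. All function names and the toy data are hypothetical and chosen for illustration.

```python
import math


def masked_average_prototype(features, mask):
    """Average the feature vectors whose mask entry is truthy."""
    selected = [f for f, m in zip(features, mask) if m]
    if not selected:
        raise ValueError("mask selects no points")
    dim = len(selected[0])
    return [sum(f[d] for f in selected) / len(selected) for d in range(dim)]


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def segment(query_features, prototypes):
    """Label each query point with the class of its most similar prototype."""
    return [
        max(prototypes, key=lambda c: cosine(q, prototypes[c]))
        for q in query_features
    ]


# Toy support set: four 2-D point features with per-point class labels.
support = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = ["chair", "chair", "floor", "floor"]

# One prototype per class, built only from the support set.
prototypes = {
    c: masked_average_prototype(support, [l == c for l in labels])
    for c in ("chair", "floor")
}

# Toy query set: each point takes the label of its nearest prototype.
query = [[0.8, 0.2], [0.2, 0.7]]
predictions = segment(query, prototypes)
print(predictions)
```

Because each class is reduced to a single averaged vector computed from the support set alone, this baseline makes both limitations named in the abstract concrete: one mean cannot cover a class with widely varying features, and nothing ties the prototype to the query feature space.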

Related papers

Query-aware Hub Prototype Learning for Few-Shot 3D Point Cloud Semantic Segmentation [15.238614809926936]
Few-shot 3D point cloud semantic segmentation (FS-3DSeg) aims to segment novel classes with only a few labeled samples. Existing metric-based prototype learning methods generate prototypes solely from the support set, without considering their relevance to query data. We propose a novel Query-aware Hub Prototype (QHP) learning method that explicitly models semantic correlations between support and query sets.
arXiv Detail & Related papers (2025-12-09T05:18:30Z)
Divide, Conquer and Unite: Hierarchical Style-Recalibrated Prototype Alignment for Federated Medical Image Segmentation [66.82598255715696]
Federated learning enables multiple medical institutions to train a global model without sharing data. Current approaches primarily focus on final-layer features, overlooking critical multi-level cues. We propose FedBCS to bridge feature representation gaps via domain-invariant contextual prototype alignment.
arXiv Detail & Related papers (2025-11-14T04:15:34Z)
DPL: Spatial-Conditioned Diffusion Prototype Enhancement for One-Shot Medical Segmentation [0.0]
One-shot medical image segmentation faces fundamental challenges in prototype representation due to limited annotated data and anatomical variability across patients. Traditional prototype-based methods rely on deterministic averaging of support features, creating brittle representations that fail to capture intra-class diversity essential for robust generalization. This work introduces Diffusion Prototype Learning, a novel framework that reformulates prototype construction through diffusion-based feature space exploration.
arXiv Detail & Related papers (2025-10-14T05:28:58Z)
Probabilistic Prototype Calibration of Vision-Language Models for Generalized Few-shot Semantic Segmentation [75.18058114915327]
Generalized Few-Shot Semantic Segmentation (GFSS) aims to extend a segmentation model to novel classes with only a few annotated examples. We propose FewCLIP, a probabilistic prototype calibration framework over multi-modal prototypes from the pretrained CLIP. We show FewCLIP significantly outperforms state-of-the-art approaches across both GFSS and class-incremental settings.
arXiv Detail & Related papers (2025-06-28T18:36:22Z)
Proto-FG3D: Prototype-based Interpretable Fine-Grained 3D Shape Classification [59.68055837500357]
We propose the first prototype-based framework named Proto-FG3D for fine-grained 3D shape classification. Proto-FG3D establishes joint multi-view and multi-category representation learning via Prototype Association. Proto-FG3D surpasses state-of-the-art methods in accuracy, transparent predictions, and ad-hoc interpretability with visualizations.
arXiv Detail & Related papers (2025-05-23T09:31:02Z)
A Deep Positive-Negative Prototype Approach to Integrated Prototypical Discriminative Learning [0.30693357740321775]
This paper proposes a novel Deep Positive-Negative Prototype (DPNP) model that combines prototype-based learning (PbL) with discriminative methods to improve class compactness and separability in deep neural networks. We show that DPNP can organize prototypes in nearly regular positions within feature space, such that it is possible to achieve competitive classification accuracy even in much lower-dimensional feature spaces.
arXiv Detail & Related papers (2025-01-05T08:24:31Z)
Query-guided Prototype Evolution Network for Few-Shot Segmentation [85.75516116674771]
We present a new method that integrates query features into the generation process of foreground and background prototypes. Experimental results on the PASCAL-$5^i$ and COCO-$20^i$ datasets attest to the substantial enhancements achieved by QPENet.
arXiv Detail & Related papers (2024-03-11T07:50:40Z)
Multimodal Prototype-Enhanced Network for Few-Shot Action Recognition [40.329190454146996]
MultimOdal PRototype-ENhanced Network (MORN) uses semantic information of label texts as multimodal information to enhance prototypes. We conduct extensive experiments on four popular few-shot action recognition datasets.
arXiv Detail & Related papers (2022-12-09T14:24:39Z)
Few-Shot Segmentation via Rich Prototype Generation and Recurrent Prediction Enhancement [12.614578133091168]
We propose a rich prototype generation module (RPGM) and a recurrent prediction enhancement module (RPEM) to reinforce the prototype learning paradigm. RPGM combines superpixel and K-means clustering to generate rich prototype features with complementary scale relationships. RPEM utilizes the recurrent mechanism to design a round-way propagation decoder.
arXiv Detail & Related papers (2022-10-03T08:46:52Z)
Dual Prototypical Contrastive Learning for Few-shot Semantic Segmentation [55.339405417090084]
We propose a dual prototypical contrastive learning approach tailored to the few-shot semantic segmentation (FSS) task. The main idea is to make the prototypes more discriminative by increasing inter-class distance while reducing intra-class distance in prototype feature space. We demonstrate that the proposed dual contrastive learning approach outperforms state-of-the-art FSS methods on PASCAL-5i and COCO-20i datasets.
arXiv Detail & Related papers (2021-11-09T08:14:50Z)
Prototype Completion for Few-Shot Learning [13.63424509914303]
Few-shot learning aims to recognize novel classes with few examples. Pre-training based methods effectively tackle the problem by pre-training a feature extractor and then fine-tuning it through the nearest centroid based meta-learning. We propose a novel prototype completion based meta-learning framework.
arXiv Detail & Related papers (2021-08-11T03:44:00Z)
End-to-end One-shot Human Parsing [91.5113227694443]
The one-shot human parsing (OSHP) task requires parsing humans into an open set of classes defined by any test example. An End-to-end One-shot human Parsing Network (EOP-Net) is proposed. EOP-Net outperforms representative one-shot segmentation models by large margins.
arXiv Detail & Related papers (2021-05-04T01:35:50Z)
Part-aware Prototype Network for Few-shot Semantic Segmentation [50.581647306020095]
We propose a novel few-shot semantic segmentation framework based on the prototype representation. Our key idea is to decompose the holistic class representation into a set of part-aware prototypes. We develop a novel graph neural network model to generate and enhance the proposed part-aware prototypes.
arXiv Detail & Related papers (2020-07-13T11:03:09Z)

This list is automatically generated from the titles and abstracts of the papers on this site.