Superpixel Anything: A general object-based framework for accurate yet regular superpixel segmentation
- URL: http://arxiv.org/abs/2509.12791v1
- Date: Tue, 16 Sep 2025 08:09:24 GMT
- Title: Superpixel Anything: A general object-based framework for accurate yet regular superpixel segmentation
- Authors: Julien Walther, Rémi Giraud, Michaël Clément
- Abstract summary: SPAM (SuperPixel Anything Model) is a versatile framework for segmenting images into accurate yet regular superpixels. We leverage a large-scale pretrained model for semantic-agnostic segmentation to ensure that superpixels align with object masks. SPAM can handle any prior high-level segmentation, resolving uncertainty regions, and is able to interactively focus on specific objects.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Superpixels are widely used in computer vision to simplify image representation and reduce computational complexity. While traditional methods rely on low-level features, deep learning-based approaches leverage high-level features but also tend to sacrifice regularity of superpixels to capture complex objects, leading to accurate but less interpretable segmentations. In this work, we introduce SPAM (SuperPixel Anything Model), a versatile framework for segmenting images into accurate yet regular superpixels. We train a model to extract image features for superpixel generation, and at inference, we leverage a large-scale pretrained model for semantic-agnostic segmentation to ensure that superpixels align with object masks. SPAM can handle any prior high-level segmentation, resolving uncertainty regions, and is able to interactively focus on specific objects. Comprehensive experiments demonstrate that SPAM qualitatively and quantitatively outperforms state-of-the-art methods on segmentation tasks, making it a valuable and robust tool for various applications. Code and pre-trained models are available here: https://github.com/waldo-j/spam.
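The central constraint — regular superpixels that never straddle an object mask — can be pictured with a small NumPy sketch. This is a simplified illustration under assumed details, not SPAM's actual algorithm: the hypothetical helpers below split a regular grid by a binary mask, rather than using learned features or a pretrained segmentation model.

```python
import numpy as np

def regular_superpixels(h, w, cell=4):
    """Regular grid superpixels: one label per cell x cell block."""
    ys, xs = np.mgrid[0:h, 0:w]
    ncols = (w + cell - 1) // cell
    return (ys // cell) * ncols + (xs // cell)

def align_to_mask(labels, mask):
    """Split every superpixel that straddles the object mask, so each
    final superpixel lies entirely inside or outside the object."""
    return labels * 2 + mask.astype(labels.dtype)

labels = regular_superpixels(8, 8, cell=4)    # 2 x 2 regular grid
mask = np.triu(np.ones((8, 8), dtype=bool))   # toy "object": upper triangle
aligned = align_to_mask(labels, mask)
# Grid cells crossed by the object boundary are split in two; the rest
# keep their regular shape, so every superpixel is now mask-pure.
```

The trade-off the abstract describes shows up directly: `labels` is maximally regular but cuts through the object, while `aligned` respects the mask at the cost of a few extra, less regular superpixels.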
Related papers
- UniPixel: Unified Object Referring and Segmentation for Pixel-Level Visual Reasoning [83.68366772745689]
We propose UniPixel, a large multi-modal model capable of flexibly comprehending visual prompt inputs and generating mask-grounded responses. Specifically, UniPixel processes visual prompts and generates relevant masks on demand, then performs subsequent reasoning conditioned on these intermediate pointers during inference. The effectiveness of our approach has been verified on 10 benchmarks across a diverse set of tasks, including pixel-level referring/segmentation and object-centric understanding in images/videos.
arXiv Detail & Related papers (2025-09-22T17:59:40Z)
- X-SAM: From Segment Anything to Any Segmentation [63.79182974315084]
Large Language Models (LLMs) demonstrate strong capabilities in broad knowledge representation, yet they are inherently deficient in pixel-level perceptual understanding. We present X-SAM, a streamlined Multimodal Large Language Model framework that extends the segmentation paradigm from "segment anything" to "any segmentation". We propose a new segmentation task, termed Visual GrounDed (VGD) segmentation, which segments all instance objects with interactive visual prompts and empowers MLLMs with visually grounded, pixel-wise interpretative capabilities.
arXiv Detail & Related papers (2025-08-06T17:19:10Z)
- Superpixel Segmentation: A Long-Lasting Ill-Posed Problem [1.104960878651584]
We show that superpixel segmentation is fundamentally an ill-posed problem, due to the implicit regularity constraint on the shape and size of superpixels.
We show that we can achieve competitive results using a recent architecture like the Segment Anything Model (SAM) without dedicated training for the superpixel segmentation task.
arXiv Detail & Related papers (2024-11-10T14:31:56Z)
- PixelLM: Pixel Reasoning with Large Multimodal Model [110.500792765109]
PixelLM is an effective and efficient LMM for pixel-level reasoning and understanding.
It produces masks from the hidden embeddings of the codebook tokens, which encode detailed target-relevant information.
PixelLM excels across various pixel-level image reasoning and understanding tasks, outperforming well-established methods in multiple benchmarks.
arXiv Detail & Related papers (2023-12-04T03:05:59Z)
- Superpixel Transformers for Efficient Semantic Segmentation [32.537400525407186]
We propose a solution by leveraging the idea of superpixels, an over-segmentation of the image, and applying them with a modern transformer framework.
Our method achieves state-of-the-art performance in semantic segmentation due to the rich superpixel features generated by the global self-attention mechanism.
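The pooling step behind such superpixel features can be sketched as follows. This is an assumed, minimal version — per-superpixel mean pooling in NumPy — not the paper's attention-based aggregation; the function name is hypothetical.

```python
import numpy as np

def superpixel_tokens(features, labels):
    """Mean-pool per-pixel features (H, W, C) into one token per
    superpixel, so a transformer attends over a few hundred tokens
    instead of H * W pixels."""
    H, W, C = features.shape
    flat_f = features.reshape(-1, C)
    flat_l = labels.reshape(-1)
    n = flat_l.max() + 1
    sums = np.zeros((n, C), dtype=features.dtype)
    np.add.at(sums, flat_l, flat_f)            # scatter-add per label
    counts = np.bincount(flat_l, minlength=n)
    return sums / counts[:, None]

features = np.array([[[1.0], [3.0]],
                     [[5.0], [7.0]]])          # 2 x 2 image, C = 1
labels = np.array([[0, 0],
                   [1, 1]])                    # two superpixels
tokens = superpixel_tokens(features, labels)   # shape (2, 1)
```

Self-attention over a few hundred such tokens is quadratic in the number of superpixels rather than pixels, which is where the efficiency gain comes from.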
arXiv Detail & Related papers (2023-09-28T23:09:30Z)
- Efficient Multiscale Object-based Superpixel Framework [62.48475585798724]
We propose a novel superpixel framework, named Superpixels through Iterative CLEarcutting (SICLE).
SICLE exploits object information and can generate a multiscale segmentation on the fly.
It generalizes recent superpixel methods, surpassing them and other state-of-the-art approaches in efficiency and effectiveness according to multiple delineation metrics.
arXiv Detail & Related papers (2022-04-07T15:59:38Z)
- Saliency Enhancement using Superpixel Similarity [77.34726150561087]
Salient Object Detection (SOD) has several applications in image analysis.
Deep-learning-based SOD methods are among the most effective, but they may miss foreground parts with similar colors.
We introduce a post-processing method, named Saliency Enhancement over Superpixel Similarity (SESS).
We demonstrate that SESS can consistently and considerably improve the results of three deep-learning-based SOD methods on five image datasets.
arXiv Detail & Related papers (2021-12-01T17:22:54Z)
- HERS Superpixels: Deep Affinity Learning for Hierarchical Entropy Rate Segmentation [0.0]
We propose a two-stage graph-based framework for superpixel segmentation.
In the first stage, we introduce an efficient Deep Affinity Learning network that learns pairwise pixel affinities.
In the second stage, we propose a highly efficient superpixel method called Hierarchical Entropy Rate Segmentation (HERS).
arXiv Detail & Related papers (2021-06-07T16:20:04Z)
- Implicit Integration of Superpixel Segmentation into Fully Convolutional Networks [11.696069523681178]
We propose a way to implicitly integrate a superpixel scheme into CNNs.
Our proposed method hierarchically groups pixels at downsampling layers and generates superpixels.
We evaluate our method on several tasks such as semantic segmentation, superpixel segmentation, and monocular depth estimation.
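One way to picture hierarchical grouping at a downsampling layer — a toy sketch under assumed details, not the paper's exact scheme: at each 2x downsampling step, every pixel joins whichever nearby coarse cell (its own or one of the eight neighbors) has the closest mean feature.

```python
import numpy as np

def downsample_group(features):
    """features: (H, W) array with H, W even. Returns (H, W) integer
    labels over the (H//2, W//2) coarse grid: each pixel picks the
    nearby coarse cell whose mean feature is closest to its own."""
    H, W = features.shape
    h2, w2 = H // 2, W // 2
    coarse = features.reshape(h2, 2, w2, 2).mean(axis=(1, 3))
    # Containing cell first, so ties keep the regular grid grouping.
    offsets = [(0, 0)] + [(di, dj) for di in (-1, 0, 1)
                          for dj in (-1, 0, 1) if (di, dj) != (0, 0)]
    labels = np.empty((H, W), dtype=int)
    for i in range(H):
        for j in range(W):
            ci, cj = i // 2, j // 2
            best, best_d = None, np.inf
            for di, dj in offsets:
                ni, nj = ci + di, cj + dj
                if 0 <= ni < h2 and 0 <= nj < w2:
                    d = abs(features[i, j] - coarse[ni, nj])
                    if d < best_d:
                        best, best_d = ni * w2 + nj, d
            labels[i, j] = best
    return labels

img = np.zeros((4, 4))
img[:, 2:] = 10.0          # bright object on the right
img[0, 2] = 0.0            # one object pixel flipped to background
labels = downsample_group(img)
# The flipped pixel deserts its straddled cell for a homogeneous dark
# neighbor, so group boundaries follow appearance, not just the grid.
```

Repeating this at successive downsampling layers yields the hierarchy of pixel groups, with superpixels read off at the chosen level.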
arXiv Detail & Related papers (2021-03-05T02:20:26Z)
- Superpixel Segmentation Based on Spatially Constrained Subspace Clustering [57.76302397774641]
We consider each representative region with independent semantic information as a subspace, and formulate superpixel segmentation as a subspace clustering problem.
We show that a simple integration of superpixel segmentation with the conventional subspace clustering does not effectively work due to the spatial correlation of the pixels.
We propose a novel convex locality-constrained subspace clustering model that is able to constrain the spatial adjacent pixels with similar attributes to be clustered into a superpixel.
arXiv Detail & Related papers (2020-12-11T06:18:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.