BEN: Using Confidence-Guided Matting for Dichotomous Image Segmentation
- URL: http://arxiv.org/abs/2501.06230v2
- Date: Sat, 01 Nov 2025 15:04:27 GMT
- Title: BEN: Using Confidence-Guided Matting for Dichotomous Image Segmentation
- Authors: Maxwell Meyer, Jack Spruyt
- Abstract summary: We propose a new architectural approach for image segmentation called Confidence-Guided Matting (CGM). BEN consists of two components: BEN Base for initial segmentation and BEN Refiner for confidence-based refinement. This work introduces a new paradigm for integrating matting and segmentation techniques, improving fine-grained object boundary prediction in computer vision.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Current approaches to dichotomous image segmentation (DIS) treat image matting and object segmentation as fundamentally different tasks. As improvements in image segmentation become increasingly challenging to achieve, combining image matting and grayscale segmentation techniques offers promising new directions for architectural innovation. Inspired by the possibility of aligning these two model tasks, we propose a new architectural approach for DIS called Confidence-Guided Matting (CGM). We created the first CGM model called Background Erase Network (BEN). BEN consists of two components: BEN Base for initial segmentation and BEN Refiner for confidence-based refinement. Our approach achieves substantial improvements over current state-of-the-art methods on the DIS5K validation dataset, demonstrating that matting-based refinement can significantly enhance segmentation quality. This work introduces a new paradigm for integrating matting and segmentation techniques, improving fine-grained object boundary prediction in computer vision.
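The abstract does not spell out how BEN Refiner uses confidence, so the following is only an illustrative sketch of one way a confidence-guided refinement step could be wired together, not BEN's actual implementation. It assumes the base stage outputs a per-pixel foreground probability and that a separate refiner supplies alternative values for uncertain pixels. The function names, the confidence formula |2p - 1|, and the 0.8 threshold are all hypothetical choices made for illustration.

```python
def confidence_map(prob):
    """Per-pixel confidence (assumed formula): 0 at p = 0.5, 1 at p = 0 or p = 1."""
    return [[abs(2.0 * p - 1.0) for p in row] for row in prob]


def select_uncertain(prob, threshold=0.8):
    """Return (row, col) coordinates whose confidence falls below the threshold."""
    conf = confidence_map(prob)
    return [
        (r, c)
        for r, row in enumerate(conf)
        for c, value in enumerate(row)
        if value < threshold
    ]


def refine(prob, refined, threshold=0.8):
    """Keep confident base predictions; take the refiner's value elsewhere."""
    conf = confidence_map(prob)
    return [
        [
            refined[r][c] if conf[r][c] < threshold else prob[r][c]
            for c in range(len(row))
        ]
        for r, row in enumerate(prob)
    ]


# Toy 2x2 example: a stand-in base prediction and a stand-in refiner output.
base_pred = [[0.05, 0.5], [0.95, 0.7]]
refiner_out = [[0.0, 0.8], [1.0, 0.9]]

uncertain_pixels = select_uncertain(base_pred)
final_mask = refine(base_pred, refiner_out)
```

In this toy example only the two pixels near the decision boundary (0.5 and 0.7) are handed to the refiner, while the confident pixels keep their base values.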
Related papers
- Seg-VAR: Image Segmentation with Visual Autoregressive Modeling [60.79579744943664]
We propose a novel framework that rethinks segmentation as a conditional autoregressive mask generation problem. This is achieved by replacing the discriminative learning with the latent learning process. Our method incorporates three core components: (1) an image encoder generating latent priors from input images, (2) a spatial-aware seglat (a latent expression of segmentation mask) encoder that maps segmentation masks into discrete latent tokens, and (3) a decoder reconstructing masks from these latents.
arXiv Detail & Related papers (2025-11-16T13:36:19Z)
- Bridging the Inter-Domain Gap through Low-Level Features for Cross-Modal Medical Image Segmentation [8.582475563483465]
This paper addresses the task of cross-modal medical image segmentation by exploring unsupervised domain adaptation (UDA) approaches. We propose a model-agnostic UDA framework, LowBridge, which builds on a simple observation that cross-modal images share some similar low-level features (e.g., edges) as they are depicting the same structures. At test time, edge features from the target images are input to the pretrained generative model to generate source-style target domain images, which are then segmented using the pretrained segmentation network.
arXiv Detail & Related papers (2025-05-17T08:49:19Z)
- DINOv2-powered Few-Shot Semantic Segmentation: A Unified Framework via Cross-Model Distillation and 4D Correlation Mining [30.564216896513596]
Few-shot semantic segmentation has gained increasing interest due to its generalization capability.
Recent approaches have turned to foundation models to enhance representation transferability.
We propose FS-DINO, with only DINOv2's encoder and a lightweight segmenter.
arXiv Detail & Related papers (2025-04-22T07:47:06Z)
- A Deep Learning Framework for Boundary-Aware Semantic Segmentation [9.680285420002516]
This study proposes a Mask2Former-based semantic segmentation algorithm incorporating a boundary enhancement feature bridging module (BEFBM). The proposed approach achieves significant improvements in metrics such as mIOU, mDICE, and mRecall. Visual analysis confirms the model's advantages in fine-grained regions.
arXiv Detail & Related papers (2025-03-28T00:00:08Z)
- Image Segmentation in Foundation Model Era: A Survey [95.60054312319939]
Current research in image segmentation lacks a detailed analysis of distinct characteristics, challenges, and solutions. This survey seeks to fill this gap by providing a thorough review of cutting-edge research centered around FM-driven image segmentation. An exhaustive overview of over 300 segmentation approaches is provided to encapsulate the breadth of current research efforts.
arXiv Detail & Related papers (2024-08-23T10:07:59Z)
- Explore In-Context Segmentation via Latent Diffusion Models [132.26274147026854]
In-context segmentation aims to segment objects using given reference images.
Most existing approaches adopt metric learning or masked image modeling to build the correlation between visual prompts and input image queries.
This work approaches the problem from a fresh perspective - unlocking the capability of the latent diffusion model for in-context segmentation.
arXiv Detail & Related papers (2024-03-14T17:52:31Z)
- Dual-scale Enhanced and Cross-generative Consistency Learning for Semi-supervised Medical Image Segmentation [49.57907601086494]
Medical image segmentation plays a crucial role in computer-aided diagnosis.
We propose a novel Dual-scale Enhanced and Cross-generative consistency learning framework for semi-supervised medical image segmentation (DEC-Seg).
arXiv Detail & Related papers (2023-12-26T12:56:31Z)
- Improving Pixel-based MIM by Reducing Wasted Modeling Capability [77.99468514275185]
We propose a new method that explicitly utilizes low-level features from shallow layers to aid pixel reconstruction.
To the best of our knowledge, we are the first to systematically investigate multi-level feature fusion for isotropic architectures.
Our method yields significant performance gains, such as 1.2% on fine-tuning, 2.8% on linear probing, and 2.6% on semantic segmentation.
arXiv Detail & Related papers (2023-08-01T03:44:56Z)
- CM-MaskSD: Cross-Modality Masked Self-Distillation for Referring Image Segmentation [29.885991324519463]
We propose a novel cross-modality masked self-distillation framework named CM-MaskSD.
Our method inherits the transferred knowledge of image-text semantic alignment from CLIP model to realize fine-grained patch-word feature alignment.
Our framework can considerably boost model performance in a nearly parameter-free manner.
arXiv Detail & Related papers (2023-05-19T07:17:27Z)
- Revisiting Image Reconstruction for Semi-supervised Semantic Segmentation [16.27277238968567]
We revisit the idea of using image reconstruction as an auxiliary task and incorporate it with a modern semi-supervised semantic segmentation framework.
Surprisingly, we discover that such an old idea in semi-supervised learning can produce results competitive with state-of-the-art semantic segmentation algorithms.
arXiv Detail & Related papers (2023-03-17T06:31:06Z)
- CoMFormer: Continual Learning in Semantic and Panoptic Segmentation [45.66711231393775]
We present the first continual learning model capable of operating on both semantic and panoptic segmentation.
Our method carefully exploits the properties of transformer architectures to learn new classes over time.
Our CoMFormer outperforms all the existing baselines by forgetting fewer old classes while also learning new classes more effectively.
arXiv Detail & Related papers (2022-11-25T10:15:06Z)
- Progressively Dual Prior Guided Few-shot Semantic Segmentation [57.37506990980975]
The few-shot semantic segmentation task aims to segment query images given only a few annotated support samples.
We propose a progressively dual prior guided few-shot semantic segmentation network.
arXiv Detail & Related papers (2022-11-20T16:19:47Z)
- BoundarySqueeze: Image Segmentation as Boundary Squeezing [104.43159799559464]
We propose a novel method for fine-grained high-quality image segmentation of both objects and scenes.
Inspired by dilation and erosion from morphological image processing techniques, we treat pixel-level segmentation problems as squeezing the object boundary.
Our method yields large gains on COCO and Cityscapes for both instance and semantic segmentation, and outperforms the previous state-of-the-art PointRend in both accuracy and speed under the same setting.
arXiv Detail & Related papers (2021-05-25T04:58:51Z)
- Improving Semantic Segmentation via Decoupled Body and Edge Supervision [89.57847958016981]
Existing semantic segmentation approaches either aim to improve the object's inner consistency by modeling the global context, or refine object detail along boundaries by multi-scale feature fusion.
In this paper, a new paradigm for semantic segmentation is proposed.
Our insight is that appealing performance of semantic segmentation requires explicitly modeling the object body and edge, which correspond to the high and low frequency of the image.
We show that the proposed framework with various baselines or backbone networks leads to better object inner consistency and object boundaries.
arXiv Detail & Related papers (2020-07-20T12:11:22Z)