Matte Anything: Interactive Natural Image Matting with Segment Anything Models
- URL: http://arxiv.org/abs/2306.04121v2
- Date: Wed, 28 Feb 2024 07:36:17 GMT
- Title: Matte Anything: Interactive Natural Image Matting with Segment Anything Models
- Authors: Jingfeng Yao, Xinggang Wang, Lang Ye, and Wenyu Liu
- Abstract summary: Matte Anything (MatAny) is an interactive natural image matting model that can produce high-quality alpha mattes.
We leverage vision foundation models to enhance the performance of natural image matting.
MatAny achieves a 58.3% improvement in MSE and a 40.6% improvement in SAD over previous image matting methods.
- Score: 35.105593013654
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Natural image matting algorithms aim to predict the transparency map (alpha matte) under trimap guidance. However, producing a trimap often requires significant labor, which limits the large-scale application of matting algorithms. To address this issue, we propose Matte Anything (MatAny), an interactive natural image matting model that can produce high-quality alpha mattes from various simple hints. The key insight of MatAny is to generate a pseudo trimap automatically via contour and transparency prediction. In our work, we leverage vision foundation models to enhance the performance of natural image matting. Specifically, we use the Segment Anything Model to predict high-quality contours with user interaction, and an open-vocabulary detector to predict the transparency of any object. Subsequently, a pre-trained image matting model generates alpha mattes from the pseudo trimaps. MatAny supports more interaction methods than any previous interactive matting algorithm and achieves the best performance to date. It consists of orthogonal vision models without any additional training. We evaluate MatAny against several current image matting algorithms: it achieves a 58.3% improvement in MSE and a 40.6% improvement in SAD over previous image matting methods with simple guidance, setting a new state of the art (SOTA). The source code and pre-trained models are available at https://github.com/hustvl/Matte-Anything.
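The pipeline the abstract describes composes off-the-shelf models: a SAM mask supplies the object contour, an open-vocabulary detector flags whether the object is transparent, and the two signals are fused into a pseudo trimap for a pre-trained matting model. Below is a minimal sketch of that trimap-generation step, assuming the segmentation mask has already been produced; the function name pseudo_trimap, the band width, and the transparency handling are illustrative assumptions, not the authors' exact implementation.

```python
import numpy as np
from scipy.ndimage import binary_dilation, binary_erosion

def pseudo_trimap(mask: np.ndarray, is_transparent: bool = False,
                  band: int = 10) -> np.ndarray:
    """Build a trimap (0 = background, 128 = unknown, 255 = foreground).

    mask: HxW boolean array, e.g. a SAM segmentation of the target object.
    is_transparent: flag from an open-vocabulary detector; if True, the whole
        object is treated as unknown, since alpha inside a transparent object
        (glass, smoke, mesh) is not simply 1.
    band: half-width in pixels of the unknown border region (assumed value).
    """
    mask = mask.astype(bool)
    if is_transparent:
        fg = np.zeros_like(mask)                    # no confident foreground
    else:
        fg = binary_erosion(mask, iterations=band)  # confident foreground core
    maybe = binary_dilation(mask, iterations=band)  # anything possibly foreground

    trimap = np.zeros(mask.shape, dtype=np.uint8)   # background stays 0
    trimap[maybe] = 128                             # unknown band (whole interior if transparent)
    trimap[fg] = 255                                # confident foreground
    return trimap

# Example: an opaque circular mask standing in for a SAM prediction.
yy, xx = np.mgrid[:256, :256]
circle = (yy - 128) ** 2 + (xx - 128) ** 2 < 80 ** 2
trimap = pseudo_trimap(circle, is_transparent=False)
```

In this sketch, the pseudo trimap would then be passed, together with the RGB image, to the pre-trained matting model, which only needs to resolve the 128-valued unknown region into fractional alpha values.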
Related papers
- Towards Natural Image Matting in the Wild via Real-Scenario Prior [69.96414467916863]
We propose a new matting dataset based on the COCO dataset, namely COCO-Matting.
The resulting COCO-Matting comprises an extensive collection of 38,251 human instance-level alpha mattes in complex natural scenarios.
For network architecture, the proposed feature-aligned transformer learns to extract fine-grained edge and transparency features.
The proposed matte-aligned decoder aims to segment matting-specific objects and convert coarse masks into high-precision mattes.
arXiv Detail & Related papers (2024-10-09T06:43:19Z)
- Learning Trimaps via Clicks for Image Matting [103.6578944248185]
We introduce Click2Trimap, an interactive model capable of predicting high-quality trimaps and alpha mattes with minimal user click inputs.
In the user study, Click2Trimap achieves high-quality trimap and matting predictions in an average of just 5 seconds per image.
arXiv Detail & Related papers (2024-03-30T12:10:34Z)
- DiffusionMat: Alpha Matting as Sequential Refinement Learning [87.76572845943929]
DiffusionMat is an image matting framework that employs a diffusion model for the transition from coarse to refined alpha mattes.
A correction module adjusts the output at each denoising step, ensuring that the final result is consistent with the input image's structures.
We evaluate our model across several image matting benchmarks, and the results indicate that DiffusionMat consistently outperforms existing methods.
arXiv Detail & Related papers (2023-11-22T17:16:44Z)
- One-Trimap Video Matting [47.95947397358026]
We propose One-Trimap Video Matting network (OTVM) that performs video matting robustly using only one user-annotated trimap.
A key component of OTVM is the joint modeling of trimap propagation and alpha prediction.
We evaluate our model on two recent video matting benchmarks, Deep Video Matting and VideoMatting108, and outperform the state of the art by significant margins.
arXiv Detail & Related papers (2022-07-27T08:19:41Z)
- PP-Matting: High-Accuracy Natural Image Matting [11.68134059283327]
PP-Matting is a trimap-free architecture that can achieve high-accuracy natural image matting.
Our method applies a high-resolution detail branch (HRDB) that extracts fine-grained details of the foreground.
Also, we propose a semantic context branch (SCB) that adopts a semantic segmentation subtask.
arXiv Detail & Related papers (2022-04-20T12:54:06Z)
- Deep Automatic Natural Image Matting [82.56853587380168]
Automatic image matting (AIM) refers to estimating the soft foreground from an arbitrary natural image without any auxiliary input such as a trimap.
We propose a novel end-to-end matting network that can predict a generalized trimap, serving as a unified semantic representation, for any natural image.
Our network trained on available composite matting datasets outperforms existing methods both objectively and subjectively.
arXiv Detail & Related papers (2021-07-15T10:29:01Z)
- Semantic Image Matting [75.21022252141474]
We show how to obtain better alpha mattes by incorporating semantic classification of matting regions into our framework.
Specifically, we consider and learn 20 classes of matting patterns, and propose to extend the conventional trimap to semantic trimap.
Experiments on multiple benchmarks show that our method outperforms prior approaches and achieves state-of-the-art performance.
arXiv Detail & Related papers (2021-04-16T16:21:02Z)
- Human Perception Modeling for Automatic Natural Image Matting [2.179313476241343]
Natural image matting aims to precisely separate foreground objects from the background using an alpha matte.
We propose an intuitively-designed trimap-free two-stage matting approach without additional annotations.
Our matting algorithm performs competitively with current state-of-the-art methods in both trimap-free and trimap-based settings.
arXiv Detail & Related papers (2021-03-31T12:08:28Z)
- Salient Image Matting [0.0]
We propose an image matting framework called Salient Image Matting to estimate the per-pixel opacity value of the most salient foreground in an image.
Our framework simultaneously deals with the challenge of learning a wide range of semantics and salient object types.
Our framework requires only a fraction of the expensive matting data used by other automatic methods.
arXiv Detail & Related papers (2021-03-23T06:22:33Z)