Boosting General Trimap-free Matting in the Real-World Image
- URL: http://arxiv.org/abs/2405.17916v1
- Date: Tue, 28 May 2024 07:37:44 GMT
- Title: Boosting General Trimap-free Matting in the Real-World Image
- Authors: Leo Shan, Wenzhang Zhou, Grace Zhao
- Abstract summary: We propose a network called Multi-Feature fusion-based Coarse-to-fine Network (MFC-Net).
Our method is significantly effective on both synthetic and real-world images, and its performance on the real-world dataset is far better than that of existing trimap-free methods.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image matting aims to obtain an alpha matte that accurately separates foreground objects from the background. Recently, trimap-free matting has been well studied because it requires only the original image without any extra input. Such methods usually extract a rough foreground themselves to take the place of the trimap as further guidance. However, the definition of 'foreground' lacks a unified standard, and thus ambiguities arise. Besides, the extracted foreground is sometimes incomplete due to inadequate network design. Most importantly, there is no large-scale real-world matting dataset, and current trimap-free methods trained on synthetic images suffer from a large domain shift in practice. In this paper, we define the salient object as the foreground, which is consistent with human cognition and with the annotations of current matting datasets. Meanwhile, data and techniques from salient object detection can be transferred to matting with ease. To obtain a more accurate and complete alpha matte, we propose the Multi-Feature fusion-based Coarse-to-fine Network (MFC-Net), which fully integrates multiple features. Furthermore, we introduce image harmony in data composition to bridge the gap between synthetic and real images. More importantly, we establish the largest general real-world matting dataset to date (Real-19k). Experiments show that our method is significantly effective on both synthetic and real-world images, and that its performance on the real-world dataset is far better than that of existing trimap-free methods. Our code and data will be released soon.
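For readers new to the term, an alpha matte is conventionally defined through the compositing equation I = alpha * F + (1 - alpha) * B, where F is the foreground, B the background, and alpha the per-pixel opacity. The sketch below only illustrates that equation on three sample pixels. It is not the MFC-Net method or the paper's data-composition pipeline, and the function name `composite` and all sample values are assumptions made for illustration.

```python
def composite(alpha, foreground, background):
    """Blend each pixel as I = alpha * F + (1 - alpha) * B.

    alpha:      list of opacities in [0, 1], one per pixel
    foreground: list of (r, g, b) tuples, one per pixel
    background: list of (r, g, b) tuples, one per pixel
    """
    if not (len(alpha) == len(foreground) == len(background)):
        raise ValueError("alpha, foreground and background must have the same length")
    out = []
    for a, f, b in zip(alpha, foreground, background):
        if not 0.0 <= a <= 1.0:
            raise ValueError("alpha values must lie in [0, 1]")
        # Apply the compositing equation channel by channel.
        out.append(tuple(a * fc + (1.0 - a) * bc for fc, bc in zip(f, b)))
    return out


# Hypothetical sample: a red foreground over a blue background.
alpha = [0.0, 0.5, 1.0]
fg = [(1.0, 0.0, 0.0)] * 3
bg = [(0.0, 0.0, 1.0)] * 3
image = composite(alpha, fg, bg)
```

With alpha at 0 the pixel is pure background, at 1 pure foreground, and at 0.5 an even mix. Matting is the inverse problem: estimating alpha from the composited image alone, which is why trimap-free methods need some notion of what counts as foreground.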
Related papers
- MLI-NeRF: Multi-Light Intrinsic-Aware Neural Radiance Fields [21.057216351934688]
Current methods for extracting intrinsic image components, such as reflectance and shading, rely on statistical priors.
We propose MLI-NeRF, which integrates Multiple Light information in Intrinsic-aware Neural Radiance Fields.
arXiv Detail & Related papers (2024-11-26T08:57:38Z)
- Towards Natural Image Matting in the Wild via Real-Scenario Prior [69.96414467916863]
We propose a new matting dataset based on the COCO dataset, namely COCO-Matting.
The built COCO-Matting comprises an extensive collection of 38,251 human instance-level alpha mattes in complex natural scenarios.
For network architecture, the proposed feature-aligned transformer learns to extract fine-grained edge and transparency features.
The proposed matte-aligned decoder aims to segment matting-specific objects and convert coarse masks into high-precision mattes.
arXiv Detail & Related papers (2024-10-09T06:43:19Z)
- Large-scale and Efficient Texture Mapping Algorithm via Loopy Belief Propagation [4.742825811314168]
A texture mapping algorithm must be able to efficiently select views, fuse and map textures from these views to mesh models.
Existing approaches achieve efficiency either by limiting the number of images to one view per face, or simplifying global inferences to only achieve local color consistency.
This paper proposes a novel and efficient texture mapping framework that allows the use of multiple views of texture per face.
arXiv Detail & Related papers (2023-05-08T15:11:28Z)
- From Synthetic to Real: Image Dehazing Collaborating with Unlabeled Real Data [58.50411487497146]
We propose a novel image dehazing framework collaborating with unlabeled real data.
First, we develop a disentangled image dehazing network (DID-Net), which disentangles the feature representations into three component maps.
Then a disentangled-consistency mean-teacher network (DMT-Net) is employed to collaborate unlabeled real data for boosting single image dehazing.
arXiv Detail & Related papers (2021-08-06T04:00:28Z)
- Deep Automatic Natural Image Matting [82.56853587380168]
Automatic image matting (AIM) refers to estimating the soft foreground from an arbitrary natural image without any auxiliary input like trimap.
We propose a novel end-to-end matting network, which can predict a generalized trimap for any image of the above types as a unified semantic representation.
Our network trained on available composite matting datasets outperforms existing methods both objectively and subjectively.
arXiv Detail & Related papers (2021-07-15T10:29:01Z)
- Salient Image Matting [0.0]
We propose an image matting framework called Salient Image Matting to estimate the per-pixel opacity value of the most salient foreground in an image.
Our framework simultaneously deals with the challenge of learning a wide range of semantics and salient object types.
Our framework requires only a fraction of expensive matting data as compared to other automatic methods.
arXiv Detail & Related papers (2021-03-23T06:22:33Z)
- Bridging Composite and Real: Towards End-to-end Deep Image Matting [88.79857806542006]
We study the roles of semantics and details for image matting.
We propose a novel Glance and Focus Matting network (GFM), which employs a shared encoder and two separate decoders.
Comprehensive empirical studies have demonstrated that GFM outperforms state-of-the-art methods.
arXiv Detail & Related papers (2020-10-30T10:57:13Z)
- Improving Deep Stereo Network Generalization with Geometric Priors [93.09496073476275]
Large datasets of diverse real-world scenes with dense ground truth are difficult to obtain.
Many algorithms rely on small real-world datasets of similar scenes or synthetic datasets.
We propose to incorporate prior knowledge of scene geometry into an end-to-end stereo network to help networks generalize better.
arXiv Detail & Related papers (2020-08-25T15:24:02Z)
- Boosting Semantic Human Matting with Coarse Annotations [66.8725980604434]
Coarsely annotated human datasets are much easier to acquire and collect from public datasets.
A matting refinement network takes in the unified mask and the input image to predict the final alpha matte.
arXiv Detail & Related papers (2020-04-10T09:11:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.