Disentangled Pre-training for Image Matting
- URL: http://arxiv.org/abs/2304.00784v2
- Date: Sun, 10 Dec 2023 12:13:56 GMT
- Title: Disentangled Pre-training for Image Matting
- Authors: Yanda Li, Zilong Huang, Gang Yu, Ling Chen, Yunchao Wei, Jianbo Jiao
- Abstract summary: Image matting requires high-quality pixel-level human annotations to support the training of a deep model.
We propose a self-supervised pre-training approach that can leverage infinite numbers of data to boost the matting performance.
- Score: 74.10407744483526
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image matting requires high-quality pixel-level human annotations to support
the training of a deep model in recent literature. However, such annotation is
costly and hard to scale, significantly holding back the development of the
research. In this work, we make the first attempt towards addressing this
problem, by proposing a self-supervised pre-training approach that can leverage
infinite numbers of data to boost the matting performance. The pre-training
task is designed in a similar manner as image matting, where random trimap and
alpha matte are generated to achieve an image disentanglement objective. The
pre-trained model is then used as an initialisation of the downstream matting
task for fine-tuning. Extensive experimental evaluations show that the proposed
approach outperforms both the state-of-the-art matting methods and other
alternative self-supervised initialisation approaches by a large margin. We
also show the robustness of the proposed approach over different backbone
architectures. Our project page is available at
https://crystraldo.github.io/dpt_mat/.
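The abstract does not spell out how the random trimap and alpha matte are produced, so the following is only a rough sketch of what one self-supervised training sample could look like. It assumes the standard compositing equation I = alpha * F + (1 - alpha) * B, a blurred random rectangle as the alpha matte, and a trimap obtained by thresholding that matte. The function names (`random_alpha`, `trimap_from_alpha`, `composite`, `make_sample`) and all parameters are hypothetical and are not taken from the paper.

```python
import numpy as np


def random_alpha(h, w, rng, k=5):
    """Hypothetical random soft matte: a random rectangle with blurred edges."""
    alpha = np.zeros((h, w), dtype=np.float32)
    y0 = int(rng.integers(0, h // 2))
    x0 = int(rng.integers(0, w // 2))
    alpha[y0:y0 + h // 2, x0:x0 + w // 2] = 1.0
    # Simple k x k box blur so the matte has a soft transition band.
    padded = np.pad(alpha, k // 2, mode="edge")
    blurred = np.zeros_like(alpha)
    for dy in range(k):
        for dx in range(k):
            blurred += padded[dy:dy + h, dx:dx + w]
    return np.clip(blurred / (k * k), 0.0, 1.0)


def trimap_from_alpha(alpha, lo=0.05, hi=0.95):
    """Assumed trimap rule: 0 = background, 255 = foreground, 128 = unknown."""
    trimap = np.full(alpha.shape, 128, dtype=np.uint8)
    trimap[alpha <= lo] = 0
    trimap[alpha >= hi] = 255
    return trimap


def composite(fg, bg, alpha):
    """Standard matting equation: I = alpha * F + (1 - alpha) * B."""
    a = alpha[..., None]  # broadcast the matte over the colour channels
    return a * fg + (1.0 - a) * bg


def make_sample(img_a, img_b, rng):
    """Blend two unlabelled images with a random matte.

    A model could then be asked to recover `alpha` from the blended image
    and the trimap, which is one way to read the "image disentanglement"
    objective mentioned in the abstract.
    """
    h, w = img_a.shape[:2]
    alpha = random_alpha(h, w, rng)
    return composite(img_a, img_b, alpha), trimap_from_alpha(alpha), alpha


rng = np.random.default_rng(0)
img_a = rng.random((32, 32, 3), dtype=np.float32)
img_b = rng.random((32, 32, 3), dtype=np.float32)
blended, trimap, alpha = make_sample(img_a, img_b, rng)
print(blended.shape, trimap.shape, alpha.shape)
```

Because the matte is synthetic, the ground-truth alpha comes for free with every pair of images, which is what would let such a pretext task scale without human annotation.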
Related papers
- PrefPaint: Aligning Image Inpainting Diffusion Model with Human Preference [62.72779589895124]
We make the first attempt to align diffusion models for image inpainting with human aesthetic standards via a reinforcement learning framework.
We train a reward model with a dataset we construct, consisting of nearly 51,000 images annotated with human preferences.
Experiments on inpainting comparison and downstream tasks, such as image extension and 3D reconstruction, demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-10-29T11:49:39Z)
- GS-Pose: Category-Level Object Pose Estimation via Geometric and Semantic Correspondence [5.500735640045456]
Category-level pose estimation is a challenging task with many potential applications in computer vision and robotics.
We propose to utilize both geometric and semantic features obtained from a pre-trained foundation model.
This requires significantly less data to train than prior methods since the semantic features are robust to object texture and appearance.
arXiv Detail & Related papers (2023-11-23T02:35:38Z)
- UMat: Uncertainty-Aware Single Image High Resolution Material Capture [2.416160525187799]
We propose a learning-based method to recover normals, specularity, and roughness from a single diffuse image of a material.
Our method is the first one to deal with the problem of modeling uncertainty in material digitization.
arXiv Detail & Related papers (2023-05-25T17:59:04Z)
- Thread Counting in Plain Weave for Old Paintings Using Semi-Supervised Regression Deep Learning Models [0.0]
The authors develop regression approaches based on deep learning to perform thread density estimation for plain weave canvas analysis.
The performance of our novel algorithm is analyzed with works by Ribera, Velázquez, and Poussin, where we compare our results to those of previous approaches.
arXiv Detail & Related papers (2023-03-28T14:15:13Z)
- Self-Distillation for Further Pre-training of Transformers [83.84227016847096]
We propose self-distillation as a regularization for a further pre-training stage.
We empirically validate the efficacy of self-distillation on a variety of benchmark datasets for image and text classification tasks.
arXiv Detail & Related papers (2022-09-30T02:25:12Z)
- Semantic keypoint-based pose estimation from single RGB frames [64.80395521735463]
We present an approach to estimating the continuous 6-DoF pose of an object from a single RGB image.
The approach combines semantic keypoints predicted by a convolutional network (convnet) with a deformable shape model.
We show that our approach can accurately recover the 6-DoF object pose for both instance- and class-based scenarios.
arXiv Detail & Related papers (2022-04-12T15:03:51Z)
- An Adaptive Framework for Learning Unsupervised Depth Completion [59.17364202590475]
We present a method to infer a dense depth map from a color image and associated sparse depth measurements.
We show that regularization and co-visibility are related via the fitness of the model to data and can be unified into a single framework.
arXiv Detail & Related papers (2021-06-06T02:27:55Z)
- Combining Semantic Guidance and Deep Reinforcement Learning For Generating Human Level Paintings [22.889059874754242]
Generation of stroke-based non-photorealistic imagery is an important problem in the computer vision community.
Previous methods have been limited to datasets with little variation in position, scale and saliency of the foreground object.
We propose a Semantic Guidance pipeline with 1) a bi-level painting procedure for learning the distinction between foreground and background brush strokes at training time.
arXiv Detail & Related papers (2020-11-25T09:00:04Z)
- Distilling Localization for Self-Supervised Representation Learning [82.79808902674282]
Contrastive learning has revolutionized unsupervised representation learning.
Current contrastive models are ineffective at localizing the foreground object.
We propose a data-driven approach for learning invariance to backgrounds.
arXiv Detail & Related papers (2020-04-14T16:29:42Z)
- AlphaNet: An Attention Guided Deep Network for Automatic Image Matting [0.0]
We propose an end-to-end solution for image matting, i.e., high-precision extraction of foreground objects from natural images.
We propose a method that assimilates semantic segmentation and deep image matting processes into a single network to generate semantic mattes.
We also construct a fashion e-commerce focused dataset with high-quality alpha mattes to facilitate the training and evaluation for image matting.
arXiv Detail & Related papers (2020-03-07T17:25:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.