Discriminative feature encoding for intrinsic image decomposition
- URL: http://arxiv.org/abs/2209.12155v1
- Date: Sun, 25 Sep 2022 05:51:49 GMT
- Title: Discriminative feature encoding for intrinsic image decomposition
- Authors: Zongji Wang, Yunfei Liu, and Feng Lu
- Abstract summary: Intrinsic image decomposition is an important and long-standing computer vision problem.
This work takes advantage of deep learning, and shows that it can solve this challenging computer vision problem with high efficiency.
- Score: 16.77439691640257
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Intrinsic image decomposition is an important and long-standing computer
vision problem. Given an input image, recovering the physical scene properties
is ill-posed. Several physically motivated priors have been used to restrict
the solution space of the optimization problem for intrinsic image
decomposition. This work takes advantage of deep learning, and shows that it
can solve this challenging computer vision problem with high efficiency. The
focus lies in the feature encoding phase to extract discriminative features for
different intrinsic layers from an input image. To achieve this goal, we
explore the distinctive characteristics of different intrinsic components in
the high dimensional feature embedding space. We define feature distribution
divergence to efficiently separate the feature vectors of different intrinsic
components. The feature distributions are also constrained to fit the real ones
through a feature distribution consistency. In addition, a data refinement
approach is provided to remove data inconsistency from the Sintel dataset,
making it more suitable for intrinsic image decomposition. Our method is also
extended to intrinsic video decomposition based on pixel-wise correspondences
between adjacent frames. Experimental results indicate that our proposed
network structure can outperform the existing state-of-the-art.
Related papers
- Toward Scalable Image Feature Compression: A Content-Adaptive and Diffusion-Based Approach [44.03561901593423]
This paper introduces a content-adaptive diffusion model for scalable image compression.
The proposed method encodes fine textures through a diffusion process, enhancing perceptual quality.
Experiments demonstrate the effectiveness of the proposed framework in both image reconstruction and downstream machine vision tasks.
arXiv Detail & Related papers (2024-10-08T15:48:34Z)
- Multi-Feature Aggregation in Diffusion Models for Enhanced Face Super-Resolution [6.055006354743854]
We develop an algorithm that utilizes a low-resolution image combined with features extracted from multiple low-quality images to generate a super-resolved image.
Unlike other algorithms, our approach recovers facial features without explicitly providing attribute information.
This is the first time multi-features combined with low-resolution images are used as conditioners to generate more reliable super-resolution images.
arXiv Detail & Related papers (2024-08-27T20:08:33Z)
- Restoring Images in Adverse Weather Conditions via Histogram Transformer [75.74328579778049]
We propose an efficient Histogram Transformer (Histoformer) for restoring images affected by adverse weather.
It is powered by a mechanism dubbed histogram self-attention, which sorts and segments spatial features into intensity-based bins.
To boost histogram self-attention, we present a dynamic-range convolution enabling conventional convolution to conduct operation over similar pixels.
arXiv Detail & Related papers (2024-07-14T11:59:22Z)
- Pixel-Inconsistency Modeling for Image Manipulation Localization [59.968362815126326]
Digital image forensics plays a crucial role in image authentication and manipulation localization.
This paper presents a generalized and robust manipulation localization model through the analysis of pixel inconsistency artifacts.
Experiments show that our method successfully extracts inherent pixel-inconsistency forgery fingerprints.
arXiv Detail & Related papers (2023-09-30T02:54:51Z)
- DBAT: Dynamic Backward Attention Transformer for Material Segmentation with Cross-Resolution Patches [8.812837829361923]
We propose the Dynamic Backward Attention Transformer (DBAT) to aggregate cross-resolution features.
Experiments show that our DBAT achieves an accuracy of 86.85%, which is the best performance among state-of-the-art real-time models.
We further align features to semantic labels, performing network dissection, to infer that the proposed model can extract material-related features better than other methods.
arXiv Detail & Related papers (2023-05-06T03:47:20Z)
- Feature Completion Transformer for Occluded Person Re-identification [25.159974510754992]
Occluded person re-identification (Re-ID) is a challenging problem because occluders destroy part of the person's visible appearance.
We propose a Feature Completion Transformer (FCFormer) to implicitly complement the semantic information of occluded parts in the feature space.
FCFormer achieves superior performance and outperforms the state-of-the-art methods by significant margins on occluded datasets.
arXiv Detail & Related papers (2023-03-03T01:12:57Z)
- CoCoNet: Coupled Contrastive Learning Network with Multi-level Feature Ensemble for Multi-modality Image Fusion [72.8898811120795]
We propose a coupled contrastive learning network, dubbed CoCoNet, to realize infrared and visible image fusion.
Our method achieves state-of-the-art (SOTA) performance under both subjective and objective evaluation.
arXiv Detail & Related papers (2022-11-20T12:02:07Z)
- Learning Enriched Features for Fast Image Restoration and Enhancement [166.17296369600774]
This paper presents a holistic goal of maintaining spatially-precise high-resolution representations through the entire network.
We learn an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
Our approach achieves state-of-the-art results for a variety of image processing tasks, including defocus deblurring, image denoising, super-resolution, and image enhancement.
arXiv Detail & Related papers (2022-04-19T17:59:45Z)
- Spatially-Adaptive Image Restoration using Distortion-Guided Networks [51.89245800461537]
We present a learning-based solution for restoring images suffering from spatially-varying degradations.
We propose SPAIR, a network design that harnesses distortion-localization information and dynamically adjusts to difficult regions in the image.
arXiv Detail & Related papers (2021-08-19T11:02:25Z)
- Gated Fusion Network for Degraded Image Super Resolution [78.67168802945069]
We propose a dual-branch convolutional neural network to extract base features and recovered features separately.
By decomposing the feature extraction step into two task-independent streams, the dual-branch model can facilitate the training process.
arXiv Detail & Related papers (2020-03-02T13:28:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.