Discriminative feature encoding for intrinsic image decomposition
- URL: http://arxiv.org/abs/2209.12155v1
- Date: Sun, 25 Sep 2022 05:51:49 GMT
- Title: Discriminative feature encoding for intrinsic image decomposition
- Authors: Zongji Wang, Yunfei Liu, and Feng Lu
- Abstract summary: Intrinsic image decomposition is an important and long-standing computer vision problem.
This work takes advantage of deep learning, and shows that it can solve this challenging computer vision problem with high efficiency.
- Score: 16.77439691640257
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Intrinsic image decomposition is an important and long-standing computer
vision problem. Given an input image, recovering the physical scene properties
is ill-posed. Several physically motivated priors have been used to restrict
the solution space of the optimization problem for intrinsic image
decomposition. This work takes advantage of deep learning, and shows that it
can solve this challenging computer vision problem with high efficiency. The
focus lies in the feature encoding phase to extract discriminative features for
different intrinsic layers from an input image. To achieve this goal, we
explore the distinctive characteristics of different intrinsic components in
the high dimensional feature embedding space. We define feature distribution
divergence to efficiently separate the feature vectors of different intrinsic
components. The feature distributions are also constrained to fit the real ones
through a feature distribution consistency. In addition, a data refinement
approach is provided to remove data inconsistency from the Sintel dataset,
making it more suitable for intrinsic image decomposition. Our method is also
extended to intrinsic video decomposition based on pixel-wise correspondences
between adjacent frames. Experimental results indicate that our proposed
network structure can outperform the existing state-of-the-art.
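The feature distribution divergence and the feature distribution consistency are only described at a high level in the abstract. As a rough, non-authoritative illustration, the PyTorch sketch below models each intrinsic component's encoded features as a channel-wise Gaussian, maximizes a symmetric KL divergence between the reflectance and shading feature distributions, and pulls predicted features toward those encoded from the ground-truth layers. All helper names (channel_stats, distribution_divergence_loss, distribution_consistency_loss) and the Gaussian approximation are assumptions for illustration, not the paper's actual formulation.

```python
# Illustrative sketch only: the paper defines "feature distribution divergence"
# and "feature distribution consistency" but does not publish this exact code.
# Both terms are approximated here with channel-wise Gaussian statistics; every
# function name and formula below is an assumption, not the authors' method.
import torch


def channel_stats(feat: torch.Tensor, eps: float = 1e-6):
    """Per-channel mean/variance of a feature map of shape (B, C, H, W)."""
    flat = feat.flatten(2)                       # (B, C, H*W)
    return flat.mean(dim=2), flat.var(dim=2) + eps


def gaussian_kl(mu_p, var_p, mu_q, var_q):
    """KL divergence between diagonal Gaussians, averaged over batch and channels."""
    kl = 0.5 * (torch.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0)
    return kl.mean()


def distribution_divergence_loss(feat_reflectance, feat_shading):
    """Encourage the two intrinsic-component feature distributions to separate:
    the larger their symmetric KL divergence, the smaller this loss."""
    mu_r, var_r = channel_stats(feat_reflectance)
    mu_s, var_s = channel_stats(feat_shading)
    sym_kl = gaussian_kl(mu_r, var_r, mu_s, var_s) + gaussian_kl(mu_s, var_s, mu_r, var_r)
    return -sym_kl                               # minimizing pushes the distributions apart


def distribution_consistency_loss(feat_pred, feat_real):
    """Pull the predicted component's feature distribution toward the distribution
    of features encoded from the ground-truth intrinsic layer."""
    mu_p, var_p = channel_stats(feat_pred)
    mu_r, var_r = channel_stats(feat_real)
    return gaussian_kl(mu_p, var_p, mu_r, var_r)


if __name__ == "__main__":
    # Toy check with random feature maps standing in for encoder outputs.
    f_reflectance = torch.randn(2, 64, 32, 32)
    f_shading = torch.randn(2, 64, 32, 32) * 2.0 + 1.0
    f_real_reflectance = torch.randn(2, 64, 32, 32)
    print(distribution_divergence_loss(f_reflectance, f_shading).item())
    print(distribution_consistency_loss(f_reflectance, f_real_reflectance).item())
```

In the full method such terms would presumably be weighted against standard reconstruction losses; the divergence measure and its weighting here are placeholders under the stated Gaussian assumption.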
Related papers
- Toward Scalable Image Feature Compression: A Content-Adaptive and Diffusion-Based Approach [44.03561901593423]
This paper introduces a content-adaptive diffusion model for scalable image compression.
The proposed method encodes fine textures through a diffusion process, enhancing perceptual quality.
Experiments demonstrate the effectiveness of the proposed framework in both image reconstruction and downstream machine vision tasks.
arXiv Detail & Related papers (2024-10-08T15:48:34Z) - Multi-Feature Aggregation in Diffusion Models for Enhanced Face Super-Resolution [6.055006354743854]
We develop an algorithm that utilizes a low-resolution image combined with features extracted from multiple low-quality images to generate a super-resolved image.
Unlike other algorithms, our approach recovers facial features without explicitly providing attribute information.
This is the first time that multiple features combined with low-resolution images are used as conditioners to generate more reliable super-resolution images.
arXiv Detail & Related papers (2024-08-27T20:08:33Z) - Restoring Images in Adverse Weather Conditions via Histogram Transformer [75.74328579778049]
We propose an efficient Histogram Transformer (Histoformer) for restoring images affected by adverse weather.
It is powered by a mechanism dubbed histogram self-attention, which sorts and segments spatial features into intensity-based bins.
To boost histogram self-attention, we present a dynamic-range convolution that enables conventional convolution to operate over similar pixels.
arXiv Detail & Related papers (2024-07-14T11:59:22Z) - Pixel-Inconsistency Modeling for Image Manipulation Localization [63.54342601757723]
Digital image forensics plays a crucial role in image authentication and manipulation localization.
This paper presents a generalized and robust manipulation localization model through the analysis of pixel inconsistency artifacts.
Experiments show that our method successfully extracts inherent pixel-inconsistency forgery fingerprints.
arXiv Detail & Related papers (2023-09-30T02:54:51Z) - DBAT: Dynamic Backward Attention Transformer for Material Segmentation with Cross-Resolution Patches [8.812837829361923]
We propose the Dynamic Backward Attention Transformer (DBAT) to aggregate cross-resolution features.
Experiments show that our DBAT achieves an accuracy of 86.85%, which is the best performance among state-of-the-art real-time models.
We further align features to semantic labels via network dissection, showing that the proposed model extracts material-related features better than other methods.
arXiv Detail & Related papers (2023-05-06T03:47:20Z) - Feature Completion Transformer for Occluded Person Re-identification [25.159974510754992]
Occluded person re-identification (Re-ID) is a challenging problem due to the corruption of person features caused by occluders.
We propose a Feature Completion Transformer (FCFormer) to implicitly complement the semantic information of occluded parts in the feature space.
FCFormer achieves superior performance and outperforms the state-of-the-art methods by significant margins on occluded datasets.
arXiv Detail & Related papers (2023-03-03T01:12:57Z) - CoCoNet: Coupled Contrastive Learning Network with Multi-level Feature Ensemble for Multi-modality Image Fusion [72.8898811120795]
We propose a coupled contrastive learning network, dubbed CoCoNet, to realize infrared and visible image fusion.
Our method achieves state-of-the-art (SOTA) performance under both subjective and objective evaluation.
arXiv Detail & Related papers (2022-11-20T12:02:07Z) - Learning Enriched Features for Fast Image Restoration and Enhancement [166.17296369600774]
This paper pursues the holistic goal of maintaining spatially precise, high-resolution representations through the entire network.
We learn an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
Our approach achieves state-of-the-art results for a variety of image processing tasks, including defocus deblurring, image denoising, super-resolution, and image enhancement.
arXiv Detail & Related papers (2022-04-19T17:59:45Z) - Spatially-Adaptive Image Restoration using Distortion-Guided Networks [51.89245800461537]
We present a learning-based solution for restoring images suffering from spatially-varying degradations.
We propose SPAIR, a network design that harnesses distortion-localization information and dynamically adjusts to difficult regions in the image.
arXiv Detail & Related papers (2021-08-19T11:02:25Z) - Gated Fusion Network for Degraded Image Super Resolution [78.67168802945069]
We propose a dual-branch convolutional neural network to extract base features and recovered features separately.
By decomposing the feature extraction step into two task-independent streams, the dual-branch model can facilitate the training process.
arXiv Detail & Related papers (2020-03-02T13:28:32Z)
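The dual-branch idea in the last entry above can be made concrete with a small sketch. The module below (hypothetical names DualBranchGatedFusion, base_branch, recover_branch) is a minimal PyTorch illustration of two task-independent feature streams blended by a learned per-pixel gate; it is not the authors' architecture, only a toy instance of the described decomposition.

```python
# Minimal sketch of a dual-branch feature extractor with gated fusion, assuming
# one stream keeps "base" features of the degraded input and a second stream
# learns "recovered" (restoration-oriented) features. Names are illustrative.
import torch
import torch.nn as nn


class DualBranchGatedFusion(nn.Module):
    def __init__(self, channels: int = 64):
        super().__init__()

        def make_branch():
            return nn.Sequential(
                nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            )

        self.base_branch = make_branch()      # task-independent stream 1
        self.recover_branch = make_branch()   # task-independent stream 2
        # Gate predicts per-pixel blending weights from both streams.
        self.gate = nn.Sequential(nn.Conv2d(2 * channels, channels, 1), nn.Sigmoid())

    def forward(self, x):
        base = self.base_branch(x)
        rec = self.recover_branch(x)
        g = self.gate(torch.cat([base, rec], dim=1))
        return g * rec + (1.0 - g) * base     # gated fusion of the two feature sets


# Toy usage on a random degraded image patch.
fused = DualBranchGatedFusion()(torch.randn(1, 3, 48, 48))
print(fused.shape)  # torch.Size([1, 64, 48, 48])
```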