Image Demoiréing Using Dual Camera Fusion on Mobile Phones
- URL: http://arxiv.org/abs/2506.08361v1
- Date: Tue, 10 Jun 2025 02:20:37 GMT
- Title: Image Demoiréing Using Dual Camera Fusion on Mobile Phones
- Authors: Yanting Mei, Zhilu Zhang, Xiaohe Wu, Wangmeng Zuo
- Abstract summary: We propose to utilize Dual Camera fusion for Image Demoiréing (DCID), i.e., using the ultra-wide-angle (UW) image to assist the moiré removal of the wide-angle (W) image. In particular, we propose an efficient DCID method, where a lightweight UW image encoder is integrated into an existing demoiréing network.
- Score: 58.39212652291496
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: When shooting electronic screens, moiré patterns usually appear in captured images, seriously degrading image quality. Existing image demoiréing methods face great challenges in removing large and heavy moiré. To address the issue, we propose to utilize Dual Camera fusion for Image Demoiréing (DCID), i.e., using the ultra-wide-angle (UW) image to assist the moiré removal of the wide-angle (W) image. This is inspired by two observations: (1) modern smartphones are commonly equipped with both lenses, and (2) the UW image generally provides normal colors and textures when moiré exists in the W image, mainly due to their different focal lengths. In particular, we propose an efficient DCID method, where a lightweight UW image encoder is integrated into an existing demoiréing network and a fast two-stage image alignment scheme is presented. Moreover, we construct a large-scale real-world dataset with diverse mobile phones and monitors, containing about 9,000 samples. Experiments on the dataset show our method performs better than state-of-the-art methods. Code and dataset are available at https://github.com/Mrduckk/DCID.
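The abstract's core idea, injecting features from a lightweight UW encoder into an existing demoiréing backbone, can be illustrated with a minimal NumPy sketch. Everything here is a hypothetical stand-in, not the paper's actual architecture: the `encode_uw` and `fuse` names, the random-projection "encoder", and concatenation as the fusion operator are all assumptions for illustration.

```python
import numpy as np

def encode_uw(uw_img, n_feats=8, seed=0):
    """Hypothetical lightweight UW encoder: a fixed random per-pixel
    projection from RGB to n_feats channels, standing in for a small
    learned convolutional encoder."""
    rng = np.random.default_rng(seed)
    h, w, c = uw_img.shape
    weights = rng.standard_normal((c, n_feats))
    return uw_img @ weights  # (h, w, n_feats) guidance feature map

def fuse(w_feats, uw_feats):
    """Channel-wise concatenation of W-image backbone features with the
    (already aligned) UW guidance features: one plausible integration
    point inside a demoireing network."""
    return np.concatenate([w_feats, uw_feats], axis=-1)

# Toy usage: 4x4 images, 16-channel backbone features for the W image.
uw_img = np.ones((4, 4, 3))
w_feats = np.zeros((4, 4, 16))
fused = fuse(w_feats, encode_uw(uw_img))
```

In a real system the UW features would first pass through the two-stage alignment the abstract mentions, since the two lenses have different viewpoints and fields of view.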
Related papers
- Self-Supervised Learning for Real-World Super-Resolution from Dual and Multiple Zoomed Observations [61.448005005426666]
We consider two challenging issues in reference-based super-resolution (RefSR) for smartphones.
We propose a novel self-supervised learning approach for real-world RefSR from observations at dual and multiple camera zooms.
arXiv Detail & Related papers (2024-05-03T15:20:30Z)
- xT: Nested Tokenization for Larger Context in Large Images [79.37673340393475]
xT is a framework for vision transformers which aggregates global context with local details.
We are able to increase accuracy by up to 8.6% on challenging classification tasks.
arXiv Detail & Related papers (2024-03-04T10:29:58Z)
- Exposure Bracketing Is All You Need For A High-Quality Image [50.822601495422916]
Multi-exposure images are complementary in denoising, deblurring, high dynamic range imaging, and super-resolution.
We propose to utilize exposure bracketing photography to get a high-quality image by combining these tasks in this work.
In particular, a temporally modulated recurrent network (TMRNet) and a self-supervised adaptation method are proposed.
arXiv Detail & Related papers (2024-01-01T14:14:35Z)
- Multimodal contrastive learning for remote sensing tasks [0.5801044612920815]
We propose a dual-encoder framework, which is pre-trained on a large unlabeled dataset (1M) of Sentinel-1 and Sentinel-2 image pairs.
We test the embeddings on two remote sensing downstream tasks: flood segmentation and land cover mapping, and empirically show that embeddings learnt from this technique outperform the conventional technique of collecting positive examples via aggressive data augmentations.
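The dual-encoder pre-training described above rests on a standard contrastive objective over paired embeddings. Below is a minimal NumPy sketch of an InfoNCE-style loss where matched rows of the two embedding batches are positives; the function name, temperature value, and exact formulation are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def info_nce(z1, z2, temperature=0.1):
    """InfoNCE-style contrastive loss: row i of z1 and row i of z2 are
    a positive pair; all other rows of z2 act as negatives."""
    # L2-normalize so the dot product is cosine similarity.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature             # (N, N) similarities
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))          # mean -log p(positive)

# Aligned pairs give a low loss; mismatched pairs give a high loss.
z = np.eye(4)
loss_aligned = info_nce(z, z)
loss_shuffled = info_nce(z, z[[1, 2, 3, 0]])
```

The loss drives matched Sentinel-1/Sentinel-2 pairs together while pushing non-matching pairs apart, which is the sense in which the pairing itself replaces aggressive data augmentations as the source of positives.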
arXiv Detail & Related papers (2022-09-06T09:31:45Z)
- A Differentiable Two-stage Alignment Scheme for Burst Image Reconstruction with Large Shift [13.454711511086261]
Joint denoising and demosaicking (JDD) for burst images, namely JDD-B, has attracted much attention.
One key challenge of JDD-B lies in the robust alignment of image frames.
We propose a differentiable two-stage alignment scheme, operating sequentially at the patch and pixel levels, for effective JDD-B.
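As a rough illustration of the patch-then-pixel idea (not the paper's differentiable scheme), here is a toy NumPy alignment that first searches a wide integer-shift window, then refines with a +/-1 pixel search. The function names and the SSD criterion are assumptions made for this sketch.

```python
import numpy as np

def coarse_shift(ref, tgt, search=4):
    """Stage 1 (coarse): exhaustively search integer shifts in a
    (2*search+1)^2 window and return the (dy, dx) minimizing SSD."""
    best, best_err = (0, 0), np.inf
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            shifted = np.roll(np.roll(tgt, dy, axis=0), dx, axis=1)
            err = np.sum((ref - shifted) ** 2)
            if err < best_err:
                best, best_err = (dy, dx), err
    return best

def align(ref, tgt, search=4):
    """Stage 2 (fine): apply the coarse shift, then refine with a +/-1
    pixel search -- a crude stand-in for pixel-level refinement."""
    dy, dx = coarse_shift(ref, tgt, search)
    coarse = np.roll(np.roll(tgt, dy, axis=0), dx, axis=1)
    rdy, rdx = coarse_shift(ref, coarse, search=1)
    return np.roll(np.roll(coarse, rdy, axis=0), rdx, axis=1)

# Toy usage: recover a frame circularly shifted by (-3, +2) pixels.
rng = np.random.default_rng(1)
ref = rng.standard_normal((16, 16))
tgt = np.roll(np.roll(ref, -3, axis=0), 2, axis=1)
aligned = align(ref, tgt, search=4)
```

The real contribution of the paper is making both stages differentiable so alignment can be trained end-to-end with the reconstruction network; the exhaustive argmin search above is not differentiable.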
arXiv Detail & Related papers (2022-03-17T12:55:45Z)
- Uni4Eye: Unified 2D and 3D Self-supervised Pre-training via Masked Image Modeling Transformer for Ophthalmic Image Classification [1.2250035750661867]
We propose a universal self-supervised Transformer framework, named Uni4Eye, to capture domain-specific feature embedding in ophthalmic images.
Uni4Eye can serve as a global feature extractor, which builds its basis on a Masked Image Modeling task with a Vision Transformer architecture.
We employ a Unified Patch Embedding module to replace the original patch embedding module in ViT for jointly processing both 2D and 3D input images.
arXiv Detail & Related papers (2022-03-09T10:02:00Z)
- Free-Form Image Inpainting via Contrastive Attention Network [64.05544199212831]
In image inpainting tasks, masks of any shape can appear anywhere in an image, forming complex patterns.
It is difficult for encoders to capture robust representations under such conditions.
We propose a self-supervised Siamese inference network to improve the robustness and generalization.
arXiv Detail & Related papers (2020-10-29T14:46:05Z)
- Wavelet-Based Dual-Branch Network for Image Demoireing [148.91145614517015]
We design a wavelet-based dual-branch network (WDNet) with a spatial attention mechanism for image demoireing.
Our network removes moiré patterns in the wavelet domain to separate the frequencies of moiré patterns from the image content.
Experiments demonstrate the effectiveness of our method, and we further show that WDNet generalizes to removing moiré artifacts on non-screen images.
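The wavelet-domain idea can be sketched with a one-level Haar transform: decompose the image into subbands, attenuate the detail subbands where high-frequency screen moiré tends to concentrate, and reconstruct. This toy filter only illustrates operating in the wavelet domain; WDNet itself uses a learned dual-branch network with spatial attention, not the fixed soft-thresholding shown here.

```python
import numpy as np

def haar2d(x):
    """One-level 2D Haar transform of an even-sized grayscale image;
    returns the (LL, LH, HL, HH) subbands."""
    a = (x[0::2] + x[1::2]) / 2.0   # row averages
    d = (x[0::2] - x[1::2]) / 2.0   # row details
    LL = (a[:, 0::2] + a[:, 1::2]) / 2.0
    LH = (a[:, 0::2] - a[:, 1::2]) / 2.0
    HL = (d[:, 0::2] + d[:, 1::2]) / 2.0
    HH = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return LL, LH, HL, HH

def ihaar2d(LL, LH, HL, HH):
    """Exact inverse of haar2d."""
    h, w = LL.shape
    a = np.empty((h, 2 * w)); d = np.empty((h, 2 * w))
    a[:, 0::2] = LL + LH; a[:, 1::2] = LL - LH
    d[:, 0::2] = HL + HH; d[:, 1::2] = HL - HH
    x = np.empty((2 * h, 2 * w))
    x[0::2] = a + d; x[1::2] = a - d
    return x

def soft_suppress(x, thresh):
    """Soft-threshold a detail subband toward zero."""
    return np.sign(x) * np.maximum(np.abs(x) - thresh, 0.0)

def demoire_toy(img, thresh=0.05):
    """Attenuate the LH/HL/HH detail subbands and reconstruct,
    leaving the low-frequency content (LL) untouched."""
    LL, LH, HL, HH = haar2d(img)
    return ihaar2d(LL, soft_suppress(LH, thresh),
                   soft_suppress(HL, thresh), soft_suppress(HH, thresh))

# Round-trip sanity check on a random 8x8 "image".
rng = np.random.default_rng(0)
img = rng.standard_normal((8, 8))
recon = ihaar2d(*haar2d(img))
filtered = demoire_toy(img)
```

Separating moiré frequencies from image content is exactly why the wavelet domain is attractive: the LL band carries most of the scene, while thin interference fringes concentrate in the detail bands.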
arXiv Detail & Related papers (2020-07-14T16:44:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.