DM$^3$Net: Dual-Camera Super-Resolution via Domain Modulation and Multi-scale Matching
- URL: http://arxiv.org/abs/2506.06993v1
- Date: Sun, 08 Jun 2025 04:51:18 GMT
- Title: DM$^3$Net: Dual-Camera Super-Resolution via Domain Modulation and Multi-scale Matching
- Authors: Cong Guan, Jiacheng Ying, Yuya Ieiri, Osamu Yoshie
- Abstract summary: DM$^3$Net is a novel dual-camera super-resolution network based on Domain Modulation and Multi-scale Matching. We introduce Key Pruning to achieve a significant reduction in memory usage and inference time with little loss in model performance.
- Score: 4.5275109094772485
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Dual-camera super-resolution is highly practical for smartphone photography, where the wide-angle image is primarily super-resolved using the telephoto image as a reference. In this paper, we propose DM$^3$Net, a novel dual-camera super-resolution network based on Domain Modulation and Multi-scale Matching. To bridge the domain gap between the high-resolution domain and the degraded domain, we learn two compressed global representations from image pairs corresponding to the two domains. To enable reliable transfer of high-frequency structural details from the reference image, we design a multi-scale matching module that conducts patch-level feature matching and retrieval across multiple receptive fields to improve matching accuracy and robustness. We also introduce Key Pruning to achieve a significant reduction in memory usage and inference time with little loss in model performance. Experimental results on three real-world datasets demonstrate that DM$^3$Net outperforms state-of-the-art approaches.
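The patch-level matching across multiple receptive fields mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the function names (`extract_patches`, `match_patches`, `multi_scale_match`), the use of cosine similarity, the stride of 1, and the patch sizes are all assumptions chosen for illustration, and Domain Modulation and Key Pruning are not covered.

```python
import numpy as np

def extract_patches(feat, size):
    """Collect every size x size patch (stride 1) of a C x H x W feature map
    as a flattened, unit-norm row vector. Hypothetical helper."""
    c, h, w = feat.shape
    rows = []
    for i in range(h - size + 1):
        for j in range(w - size + 1):
            rows.append(feat[:, i:i + size, j:j + size].reshape(-1))
    patches = np.stack(rows)
    norms = np.linalg.norm(patches, axis=1, keepdims=True)
    return patches / np.maximum(norms, 1e-8)

def match_patches(lr_feat, ref_feat, size):
    """For each patch of the low-resolution features, find the most similar
    reference patch by cosine similarity (an assumed similarity measure)."""
    queries = extract_patches(lr_feat, size)
    keys = extract_patches(ref_feat, size)
    sim = queries @ keys.T               # cosine similarity, rows are unit-norm
    return sim.argmax(axis=1), sim.max(axis=1)

def multi_scale_match(lr_feat, ref_feat, sizes=(1, 3, 5)):
    """Repeat the matching at several patch sizes (receptive fields)."""
    return {s: match_patches(lr_feat, ref_feat, s) for s in sizes}

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    lr = rng.standard_normal((4, 8, 8))
    ref = rng.standard_normal((4, 8, 8))
    for size, (idx, score) in multi_scale_match(lr, ref).items():
        print(size, idx.shape, float(score.mean()))
```

Each scale returns, per query patch, the index of the best reference patch and its similarity score. A real network would use the index to retrieve reference features and could use the score to weight them.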
Related papers
- INF-LLaVA: Dual-perspective Perception for High-Resolution Multimodal Large Language Model [71.50973774576431]
We propose a novel MLLM, INF-LLaVA, designed for effective high-resolution image perception.
We introduce a Dual-perspective Cropping Module (DCM), which ensures that each sub-image contains continuous details from a local perspective.
We also introduce a Dual-perspective Enhancement Module (DEM) to enable the mutual enhancement of global and local features.
arXiv Detail & Related papers (2024-07-23T06:02:30Z)
- Semantic Segmentation for Real-World and Synthetic Vehicle's Forward-Facing Camera Images [0.8562182926816566]
This is a solution to the semantic segmentation problem in both real-world and synthetic images from a vehicle's forward-facing camera.
We concentrate on building a robust model that performs well across various domains of different outdoor situations.
This paper studies the effectiveness of employing real-world and synthetic data to handle domain adaptation in the semantic segmentation problem.
arXiv Detail & Related papers (2024-07-07T17:28:45Z)
- Unified Frequency-Assisted Transformer Framework for Detecting and Grounding Multi-Modal Manipulation [109.1912721224697]
We present the Unified Frequency-Assisted transFormer framework, named UFAFormer, to address the DGM4 problem.
By leveraging the discrete wavelet transform, we decompose images into several frequency sub-bands, capturing rich face forgery artifacts.
Our proposed frequency encoder, incorporating intra-band and inter-band self-attentions, explicitly aggregates forgery features within and across diverse sub-bands.
arXiv Detail & Related papers (2023-09-18T11:06:42Z)
- Dual Adversarial Adaptation for Cross-Device Real-World Image Super-Resolution [114.26933742226115]
Super-resolution (SR) models trained on images from different devices could exhibit distinct imaging patterns.
We propose an unsupervised domain adaptation mechanism for real-world SR, named Dual ADversarial Adaptation (DADA).
We empirically conduct experiments under six Real to Real adaptation settings among three different cameras, and achieve superior performance compared with existing state-of-the-art approaches.
arXiv Detail & Related papers (2022-05-07T02:55:39Z)
- A Triple-Double Convolutional Neural Network for Panchromatic Sharpening [31.392337484731783]
Pansharpening refers to the fusion of a panchromatic image with a high spatial resolution and a multispectral image with a low spatial resolution.
In this paper, we propose a novel deep neural network architecture with a level-domain based loss function for pansharpening.
arXiv Detail & Related papers (2021-12-04T04:22:11Z)
- Multi-Spectral Multi-Image Super-Resolution of Sentinel-2 with Radiometric Consistency Losses and Its Effect on Building Delineation [23.025397327720874]
We present the first results of applying multi-image super-resolution (MISR) to multi-spectral remote sensing imagery.
We show that MISR is superior to single-image super-resolution and other baselines on a range of image fidelity metrics.
arXiv Detail & Related papers (2021-11-05T02:49:04Z)
- Dual-Camera Super-Resolution with Aligned Attention Modules [56.54073689003269]
We present a novel approach to reference-based super-resolution (RefSR) with a focus on dual-camera super-resolution (DCSR).
Our proposed method generalizes the standard patch-based feature matching with spatial alignment operations.
To bridge the domain gaps between real-world images and the training images, we propose a self-supervised domain adaptation strategy.
arXiv Detail & Related papers (2021-09-03T07:17:31Z)
- Learning Enriched Features for Real Image Restoration and Enhancement [166.17296369600774]
Convolutional neural networks (CNNs) have achieved dramatic improvements over conventional approaches for image restoration tasks.
We present a novel architecture with the collective goals of maintaining spatially-precise high-resolution representations through the entire network.
Our approach learns an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
arXiv Detail & Related papers (2020-03-15T11:04:30Z)
- DDet: Dual-path Dynamic Enhancement Network for Real-World Image Super-Resolution [69.2432352477966]
Real image super-resolution (Real-SR) focuses on the relationship between real-world high-resolution (HR) and low-resolution (LR) images.
In this article, we propose a Dual-path Dynamic Enhancement Network (DDet) for Real-SR.
Unlike conventional methods, which stack massive convolutional blocks for feature representation, we introduce a content-aware framework to study non-inherently aligned image pairs.
arXiv Detail & Related papers (2020-02-25T18:24:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.