Beyond Image Borders: Learning Feature Extrapolation for Unbounded Image
Composition
- URL: http://arxiv.org/abs/2309.12042v1
- Date: Thu, 21 Sep 2023 13:10:28 GMT
- Title: Beyond Image Borders: Learning Feature Extrapolation for Unbounded Image
Composition
- Authors: Xiaoyu Liu, Ming Liu, Junyi Li, Shuai Liu, Xiaotao Wang, Lei Lei,
Wangmeng Zuo
- Abstract summary: We present a joint framework for both recommendation of camera view and image composition (i.e., UNIC).
Specifically, our framework takes the current camera preview frame as input and provides a recommendation for view adjustment.
Our method converges and results in both a camera view and a bounding box showing the image composition recommendation.
- Score: 80.14697389188143
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: For improving image composition and aesthetic quality, most existing methods
modulate the captured images by striking out redundant content near the image
borders. However, such image cropping methods are limited in the range of image
views. Some methods have been suggested to extrapolate the images and predict
cropping boxes from the extrapolated image. Nonetheless, the synthesized
extrapolated regions may be included in the cropped image, making the
composition result unrealistic and potentially degrading image quality. In
this paper, we circumvent this issue by presenting a joint framework for both
unbounded recommendation of camera view and image composition (i.e., UNIC). In
this way, the cropped image is a sub-image of the image acquired by the
predicted camera view, and thus can be guaranteed to be real and consistent in
image quality. Specifically, our framework takes the current camera preview
frame as input and provides a recommendation for view adjustment, which
contains operations unlimited by the image borders, such as zooming in or out
and camera movement. To improve the accuracy of view adjustment prediction, we
further extend the field of view by feature extrapolation. After one or several
view adjustments, our method converges and results in
both a camera view and a bounding box showing the image composition
recommendation. Extensive experiments are conducted on the datasets constructed
upon existing image cropping datasets, showing the effectiveness of our UNIC in
unbounded recommendation of camera view and image composition. The source code,
dataset, and pretrained models are available at
https://github.com/liuxiaoyu1104/UNIC.
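The abstract describes an iterative procedure: predict a view adjustment (shift, zoom) from the current preview, apply it, and stop once the predicted adjustment converges to (near-)identity, yielding the final camera view. A minimal sketch of that loop, with a toy predictor standing in for the learned model (all names and the view representation here are illustrative, not from the UNIC codebase):

```python
from dataclasses import dataclass


@dataclass
class Adjustment:
    """One predicted camera-view adjustment (shift and zoom change)."""
    dx: float = 0.0     # horizontal camera shift
    dy: float = 0.0     # vertical camera shift
    dzoom: float = 0.0  # zoom change (positive = zoom in)

    def is_identity(self, eps=1e-3):
        # Convergence test: no meaningful adjustment left to make.
        return max(abs(self.dx), abs(self.dy), abs(self.dzoom)) < eps


def recommend_view(view, predict_adjustment, max_steps=20):
    """Repeatedly predict and apply view adjustments until the
    predicted adjustment is (near-)identity, then return the view."""
    for _ in range(max_steps):
        adj = predict_adjustment(view)
        if adj.is_identity():
            break
        view = {"x": view["x"] + adj.dx,
                "y": view["y"] + adj.dy,
                "zoom": view["zoom"] + adj.dzoom}
    return view


# Toy predictor standing in for the learned network: nudge the view
# halfway toward a fixed "ideal" composition each step.
TARGET = {"x": 2.0, "y": 1.0, "zoom": 1.5}


def toy_predictor(view):
    return Adjustment((TARGET["x"] - view["x"]) / 2,
                      (TARGET["y"] - view["y"]) / 2,
                      (TARGET["zoom"] - view["zoom"]) / 2)


final = recommend_view({"x": 0.0, "y": 0.0, "zoom": 1.0}, toy_predictor)
```

In the paper the predictor is a learned network operating on extrapolated features, and the output also includes a cropping bounding box; the sketch only illustrates the predict-apply-converge control flow.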
Related papers
- Generative Panoramic Image Stitching [10.512280991285893]
We introduce the task of generative panoramic image stitching, which aims to synthesize seamless panoramas. Traditional image stitching pipelines fail when tasked with synthesizing large, coherent regions of a panorama. We propose a method that fine-tunes a diffusion-based inpainting model to preserve a scene's content and layout based on multiple reference images.
arXiv Detail & Related papers (2025-07-08T22:07:12Z)
- FreeCompose: Generic Zero-Shot Image Composition with Diffusion Prior [50.0535198082903]
We offer a novel approach to image composition, which integrates multiple input images into a single, coherent image.
We showcase the potential of utilizing the powerful generative prior inherent in large-scale pre-trained diffusion models to accomplish generic image composition.
arXiv Detail & Related papers (2024-07-06T03:35:43Z)
- Image Cropping under Design Constraints [19.364718428893923]
In display media, image cropping is often required to satisfy various constraints, such as an aspect ratio and blank regions for placing texts or objects.
We propose a score function-based approach, which computes scores indicating whether cropped results are aesthetically plausible and satisfy design constraints.
In experiments, we demonstrate that the proposed approaches outperform a baseline, and we observe that the proposal-based approach is better than the heatmap-based approach under the same computation cost.
arXiv Detail & Related papers (2023-10-13T06:53:28Z)
- iEdit: Localised Text-guided Image Editing with Weak Supervision [53.082196061014734]
We propose a novel learning method for text-guided image editing.
It generates images conditioned on a source image and a textual edit prompt.
It shows favourable results against its counterparts in terms of image fidelity, CLIP alignment score and qualitatively for editing both generated and real images.
arXiv Detail & Related papers (2023-05-10T07:39:14Z)
- ClipCrop: Conditioned Cropping Driven by Vision-Language Model [90.95403416150724]
We take advantage of vision-language models as a foundation for creating robust and user-intentional cropping algorithms.
We develop a method to perform cropping with a text or image query that reflects the user's intention as guidance.
Our pipeline design allows the model to learn text-conditioned aesthetic cropping with a small dataset.
arXiv Detail & Related papers (2022-11-21T14:27:07Z)
- im2nerf: Image to Neural Radiance Field in the Wild [47.18702901448768]
im2nerf is a learning framework that predicts a continuous neural object representation given a single input image in the wild.
We show that im2nerf achieves the state-of-the-art performance for novel view synthesis from a single-view unposed image in the wild.
arXiv Detail & Related papers (2022-09-08T23:28:56Z)
- Guided Co-Modulated GAN for 360° Field of View Extrapolation [15.850166450573756]
We propose a method to extrapolate a 360° field of view from a single image.
Our method obtains state-of-the-art results and outperforms previous methods on standard image quality metrics.
arXiv Detail & Related papers (2022-04-15T01:48:35Z)
- SSH: A Self-Supervised Framework for Image Harmonization [97.16345684998788]
We propose a novel Self-Supervised Harmonization framework (SSH) that can be trained using just "free" natural images without being edited.
Our results show that the proposed SSH outperforms previous state-of-the-art methods in terms of reference metrics, visual quality, and a subject user study.
arXiv Detail & Related papers (2021-08-15T19:51:33Z)
- Camera View Adjustment Prediction for Improving Image Composition [14.541539156817045]
We propose a deep learning-based approach that provides suggestions to the photographer on how to adjust the camera view before capturing.
By optimizing the composition before a photo is captured, our system helps photographers to capture better photos.
arXiv Detail & Related papers (2021-04-15T17:18:31Z)
- Bridging the Visual Gap: Wide-Range Image Blending [16.464837892640812]
We introduce an effective deep-learning model to realize wide-range image blending.
We experimentally demonstrate that our proposed method is able to produce visually appealing results.
arXiv Detail & Related papers (2021-03-28T15:07:45Z)
- Deep Photo Cropper and Enhancer [65.11910918427296]
We propose a new type of image enhancement problem: to crop an image which is embedded within a photo.
We split our proposed approach into two deep networks: deep photo cropper and deep image enhancer.
In the photo cropper network, we employ a spatial transformer to extract the embedded image.
In the photo enhancer, we employ super-resolution to increase the number of pixels in the embedded image.
arXiv Detail & Related papers (2020-08-03T03:50:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.