HALO: Human-Aligned End-to-end Image Retargeting with Layered Transformations
- URL: http://arxiv.org/abs/2504.03026v1
- Date: Thu, 03 Apr 2025 20:53:19 GMT
- Title: HALO: Human-Aligned End-to-end Image Retargeting with Layered Transformations
- Authors: Yiran Xu, Siqi Xie, Zhuofang Li, Harris Shadmany, Yinxiao Li, Luciano Sbaiz, Miaosen Wang, Junjie Ke, Jose Lezama, Hang Qi, Han Zhang, Jesse Berent, Ming-Hsuan Yang, Irfan Essa, Jia-Bin Huang, Feng Yang,
- Abstract summary: Image aims to change the aspect-ratio of an image while maintaining its content and structure with less visual artifacts.<n>HALO decomposes the structure into salient/non-salient layers and applies different wrapping fields to different layers.<n>Our method achieves an 18.4% higher user preference compared to the baselines on average.
- Score: 55.78224865743332
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Image retargeting aims to change the aspect-ratio of an image while maintaining its content and structure with less visual artifacts. Existing methods still generate many artifacts or fail to maintain original content or structure. To address this, we introduce HALO, an end-to-end trainable solution for image retargeting. Since humans are more sensitive to distortions in salient areas than non-salient areas of an image, HALO decomposes the input image into salient/non-salient layers and applies different wrapping fields to different layers. To further minimize the structure distortion in the output images, we propose perceptual structure similarity loss which measures the structure similarity between input and output images and aligns with human perception. Both quantitative results and a user study on the RetargetMe dataset show that HALO achieves SOTA. Especially, our method achieves an 18.4% higher user preference compared to the baselines on average.
Related papers
- Decompositional Neural Scene Reconstruction with Generative Diffusion Prior [64.71091831762214]
Decompositional reconstruction of 3D scenes, with complete shapes and detailed texture, is intriguing for downstream applications.<n>Recent approaches incorporate semantic or geometric regularization to address this issue, but they suffer significant degradation in underconstrained areas.<n>We propose DP-Recon, which employs diffusion priors in the form of Score Distillation Sampling (SDS) to optimize the neural representation of each individual object under novel views.
arXiv Detail & Related papers (2025-03-19T02:11:31Z) - Towards Cross-View-Consistent Self-Supervised Surround Depth Estimation [9.569646683579899]
Self-Supervised Surround Depth Estimation from consecutive images offers an economical alternative.<n>Previous SSSDE methods have proposed different mechanisms to fuse information across images, but few of them explicitly consider the cross-view constraints.<n>This paper proposes an efficient and consistent pose estimation design and two loss functions to enhance cross-view consistency for SSSDE.
arXiv Detail & Related papers (2024-07-04T16:29:05Z) - OAIR: Object-Aware Image Retargeting Using PSO and Aesthetic Quality
Assessment [11.031841470875571]
Previous image methods create outputs that suffer from artifacts and distortions.
Simultaneous resizing of the foreground and background causes changes in the aspect ratios of the objects.
We propose a method that overcomes these problems.
arXiv Detail & Related papers (2022-09-11T07:16:59Z) - Controllable Person Image Synthesis with Spatially-Adaptive Warped
Normalization [72.65828901909708]
Controllable person image generation aims to produce realistic human images with desirable attributes.
We introduce a novel Spatially-Adaptive Warped Normalization (SAWN), which integrates a learned flow-field to warp modulation parameters.
We propose a novel self-training part replacement strategy to refine the pretrained model for the texture-transfer task.
arXiv Detail & Related papers (2021-05-31T07:07:44Z) - The Spatially-Correlative Loss for Various Image Translation Tasks [69.62228639870114]
We propose a novel spatially-correlative loss that is simple, efficient and yet effective for preserving scene structure consistency.
Previous methods attempt this by using pixel-level cycle-consistency or feature-level matching losses.
We show distinct improvement over baseline models in all three modes of unpaired I2I translation: single-modal, multi-modal, and even single-image translation.
arXiv Detail & Related papers (2021-04-02T02:13:30Z) - Image Inpainting with Learnable Feature Imputation [8.293345261434943]
A regular convolution layer applying a filter in the same way over known and unknown areas causes visual artifacts in the inpainted image.
We propose (layer-wise) feature imputation of the missing input values to a convolution.
We present comparisons on CelebA-HQ and Places2 to current state-of-the-art to validate our model.
arXiv Detail & Related papers (2020-11-02T16:05:32Z) - Adversarial Semantic Data Augmentation for Human Pose Estimation [96.75411357541438]
We propose Semantic Data Augmentation (SDA), a method that augments images by pasting segmented body parts with various semantic granularity.
We also propose Adversarial Semantic Data Augmentation (ASDA), which exploits a generative network to dynamiclly predict tailored pasting configuration.
State-of-the-art results are achieved on challenging benchmarks.
arXiv Detail & Related papers (2020-08-03T07:56:04Z) - Image Fine-grained Inpainting [89.17316318927621]
We present a one-stage model that utilizes dense combinations of dilated convolutions to obtain larger and more effective receptive fields.
To better train this efficient generator, except for frequently-used VGG feature matching loss, we design a novel self-guided regression loss.
We also employ a discriminator with local and global branches to ensure local-global contents consistency.
arXiv Detail & Related papers (2020-02-07T03:45:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.