Aesthetic Photo Collage with Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2110.09775v1
- Date: Tue, 19 Oct 2021 07:34:48 GMT
- Title: Aesthetic Photo Collage with Deep Reinforcement Learning
- Authors: Mingrui Zhang, Mading Li, Li Chen, Jiahao Yu
- Abstract summary: Photo collage aims to automatically arrange multiple photos on a given canvas with high aesthetic quality.
Deep learning offers a promising alternative, but owing to the complexity of collage and the lack of training data, a solution has yet to be found.
We propose a novel pipeline for the automatic generation of collages with a specified aspect ratio, and introduce reinforcement learning to collage generation for the first time.
- Score: 16.523810962786705
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Photo collage aims to automatically arrange multiple photos on a given canvas
with high aesthetic quality. Existing methods rely mainly on handcrafted
feature optimization, which cannot adequately capture high-level human
aesthetic sense. Deep learning offers a promising alternative, but owing to the
complexity of collage and the lack of training data, a solution has yet to be
found. In this paper, we propose a novel pipeline for the automatic generation
of collages with a specified aspect ratio, and introduce reinforcement learning
to collage generation for the first time. Inspired by manual collages, we model
collage generation as a sequential decision process that adjusts spatial
positions, orientation angles, placement order, and the global layout. To guide
the agent in improving both the overall layout and local details, we design a
reward function tailored to collage that accounts for both subjective and
objective factors. To overcome the lack of training data, we pretrain our deep
aesthetic network on a large-scale image aesthetic dataset (CPC) for general
aesthetic feature extraction, and propose an attention fusion module for
structural collage feature representation. We test our model against competing
methods on two movie datasets, and our results outperform the others in
aesthetic quality evaluation. A further user study also demonstrates its
effectiveness.
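The sequential-decision formulation in the abstract (per-step placement decisions over position, angle, and order, with a reward mixing subjective and objective factors) can be sketched as a toy environment. Everything below is an illustrative assumption, not the paper's implementation: the `CollageEnv` and `Placement` names, the coverage proxy, and the 0.5/0.5 reward weights are all hypothetical, and the learned aesthetic network is replaced by a pluggable `aesthetic_fn` stub.

```python
from dataclasses import dataclass

@dataclass
class Placement:
    x: float      # normalized center position on the canvas
    y: float
    angle: float  # orientation angle in degrees
    scale: float  # photo size relative to the canvas

class CollageEnv:
    """Toy sequential-decision view of collage layout.

    Each step places the next photo; the episode ends when all photos
    are placed. The reward mixes an objective term (a crude canvas
    coverage proxy) with a stand-in for a learned aesthetic score.
    """

    def __init__(self, n_photos, aesthetic_fn=None):
        self.n_photos = n_photos
        # Stand-in for the paper's pretrained aesthetic network.
        self.aesthetic_fn = aesthetic_fn or (lambda placements: 0.0)
        self.reset()

    def reset(self):
        self.placements = []
        return self._state()

    def _state(self):
        # State: how many photos are placed, and where.
        return (len(self.placements), tuple(self.placements))

    def _coverage(self):
        # Objective proxy: total scaled area, capped at 1 (ignores overlap).
        return min(1.0, sum(p.scale ** 2 for p in self.placements))

    def step(self, action: Placement):
        self.placements.append(action)
        done = len(self.placements) == self.n_photos
        # Illustrative weighted mix of objective and subjective factors.
        reward = 0.5 * self._coverage() + 0.5 * self.aesthetic_fn(self.placements)
        return self._state(), reward, done
```

A rollout simply calls `step` once per photo; a real agent would choose each `Placement` from a learned policy rather than a fixed value.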
Related papers
- Learning to Compose: Improving Object Centric Learning by Injecting Compositionality [27.364435779446072]
Compositional representation is a key aspect of object-centric learning.
Most existing approaches rely on an auto-encoding objective.
We propose a novel objective that explicitly encourages compositionality of the representations.
arXiv Detail & Related papers (2024-05-01T17:21:36Z) - ObjBlur: A Curriculum Learning Approach With Progressive Object-Level Blurring for Improved Layout-to-Image Generation [7.645341879105626]
We present ObjBlur, a novel curriculum learning approach to improve layout-to-image generation models.
Our method is based on progressive object-level blurring, which effectively stabilizes training and enhances the quality of generated images.
arXiv Detail & Related papers (2024-04-11T08:50:12Z) - Neural Collage Transfer: Artistic Reconstruction via Material
Manipulation [20.72219392904935]
Collage is a creative art form that uses diverse material scraps as a base unit to compose a single image.
Pixel-wise generation techniques can reproduce a target image in collage style, but they are ill-suited to the solid, stroke-by-stroke nature of the collage form.
We propose a method for learning to make collages via reinforcement learning without the need for demonstrations or collage artwork data.
arXiv Detail & Related papers (2023-11-03T19:10:37Z) - Cones 2: Customizable Image Synthesis with Multiple Subjects [50.54010141032032]
We study how to efficiently represent a particular subject as well as how to appropriately compose different subjects.
By rectifying the activations in the cross-attention map, the layout appoints and separates the location of different subjects in the image.
arXiv Detail & Related papers (2023-05-30T18:00:06Z) - Controllable Person Image Synthesis with Spatially-Adaptive Warped Normalization [72.65828901909708]
Controllable person image generation aims to produce realistic human images with desirable attributes.
We introduce a novel Spatially-Adaptive Warped Normalization (SAWN), which integrates a learned flow-field to warp modulation parameters.
We propose a novel self-training part replacement strategy to refine the pretrained model for the texture-transfer task.
arXiv Detail & Related papers (2021-05-31T07:07:44Z) - Deep Image Compositing [93.75358242750752]
We propose a new method which can automatically generate high-quality image composites without any user input.
Inspired by Laplacian pyramid blending, a dense-connected multi-stream fusion network is proposed to effectively fuse the information from the foreground and background images.
Experiments show that the proposed method can automatically generate high-quality composites and outperforms existing methods both qualitatively and quantitatively.
arXiv Detail & Related papers (2020-11-04T06:12:24Z) - Bridging Composite and Real: Towards End-to-end Deep Image Matting [88.79857806542006]
We study the roles of semantics and details for image matting.
We propose a novel Glance and Focus Matting network (GFM), which employs a shared encoder and two separate decoders.
Comprehensive empirical studies have demonstrated that GFM outperforms state-of-the-art methods.
arXiv Detail & Related papers (2020-10-30T10:57:13Z) - Perspective Plane Program Induction from a Single Image [85.28956922100305]
We study the inverse graphics problem of inferring a holistic representation for natural images.
We formulate this problem as jointly finding the camera pose and scene structure that best describe the input image.
Our proposed framework, Perspective Plane Program Induction (P3I), combines search-based and gradient-based algorithms to efficiently solve the problem.
arXiv Detail & Related papers (2020-06-25T21:18:58Z) - AlphaNet: An Attention Guided Deep Network for Automatic Image Matting [0.0]
We propose an end-to-end solution for image matting, i.e., high-precision extraction of foreground objects from natural images.
We propose a method that assimilates semantic segmentation and deep image matting processes into a single network to generate semantic mattes.
We also construct a fashion e-commerce focused dataset with high-quality alpha mattes to facilitate the training and evaluation for image matting.
arXiv Detail & Related papers (2020-03-07T17:25:21Z) - SketchyCOCO: Image Generation from Freehand Scene Sketches [71.85577739612579]
We introduce the first method for automatic image generation from scene-level freehand sketches.
The key contribution is an attribute-vector-bridged Generative Adversarial Network called EdgeGAN.
We have built a large-scale composite dataset called SketchyCOCO to support and evaluate the solution.
arXiv Detail & Related papers (2020-03-05T14:54:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.