Manga Generation via Layout-controllable Diffusion
- URL: http://arxiv.org/abs/2412.19303v1
- Date: Thu, 26 Dec 2024 17:52:19 GMT
- Title: Manga Generation via Layout-controllable Diffusion
- Authors: Siyu Chen, Dengjie Li, Zenghao Bao, Yao Zhou, Lingfeng Tan, Yujie Zhong, Zheng Zhao,
- Abstract summary: This paper presents the manga generation task and constructs the Manga109Story dataset for studying manga generation solely from plain text.
We propose MangaDiffusion to facilitate the intra-panel and inter-panel information interaction during the manga generation process.
- Score: 21.080054070512023
- License:
- Abstract: Generating comics from text is widely studied. However, there are few studies on generating multi-panel Manga (Japanese comics) solely from plain text. Japanese manga contains multiple panels on a single page, with characteristics such as coherence in storytelling, reasonable and diverse page layouts, consistency in characters, and semantic correspondence between panel drawings and panel scripts. Generating manga therefore poses a significant challenge. This paper presents the manga generation task and constructs the Manga109Story dataset for studying manga generation solely from plain text. Additionally, we propose MangaDiffusion to facilitate the intra-panel and inter-panel information interaction during the manga generation process. The results show that our method ensures the correct number of panels and produces reasonable and diverse page layouts. Based on our approach, there is potential to convert a large amount of textual stories into more engaging manga, suggesting significant application prospects.
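The abstract's "intra-panel and inter-panel information interaction" can be pictured as a block-structured attention mask. The sketch below is purely illustrative, not the actual MangaDiffusion mechanism (whose design is in the paper); the function name and panel sizes are invented for the example.

```python
# Illustrative sketch only: one way to restrict attention so tokens
# interact fully within a panel (intra-panel) while optionally also
# attending across panels (inter-panel). NOT MangaDiffusion's actual
# design; panel_attention_mask and its arguments are assumptions.
def panel_attention_mask(panel_sizes, allow_inter_panel=True):
    """Return a boolean matrix M where M[i][j] means token i may attend to token j."""
    n = sum(panel_sizes)
    # Inter-panel attention: every token may attend everywhere if enabled.
    mask = [[allow_inter_panel] * n for _ in range(n)]
    start = 0
    for size in panel_sizes:
        # Intra-panel attention: full attention inside each panel's token block.
        for i in range(start, start + size):
            for j in range(start, start + size):
                mask[i][j] = True
        start += size
    return mask

# Tokens 0-2 belong to panel 1, tokens 3-4 to panel 2.
m = panel_attention_mask([3, 2], allow_inter_panel=False)
```

With `allow_inter_panel=False`, tokens only see their own panel; a real model could instead route inter-panel interaction through adjacent panels or summary tokens.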
Related papers
- How Panel Layouts Define Manga: Insights from Visual Ablation Experiments [24.408092528259424]
This paper aims to analyze the visual characteristics of manga works, with a particular focus on panel layout features.
As a research method, we used facing page images of manga as input to train a deep learning model for predicting manga titles.
Specifically, we conducted ablation studies by limiting page image information to panel frames to analyze the characteristics of panel layouts.
arXiv Detail & Related papers (2024-12-26T09:53:37Z) - Tails Tell Tales: Chapter-Wide Manga Transcriptions with Character Names [53.24414727354768]
This paper aims to generate a dialogue transcript of a complete manga chapter, entirely automatically.
It involves identifying (i) what is being said, i.e., detecting the texts on each page and classifying them as essential or non-essential.
It also ensures the same characters are named consistently throughout the chapter.
arXiv Detail & Related papers (2024-08-01T05:47:04Z) - MangaUB: A Manga Understanding Benchmark for Large Multimodal Models [25.63892470012361]
Manga is a popular medium that combines stylized drawings and text to convey stories.
Recently, the adaptability of modern large multimodal models (LMMs) has opened possibilities for more general approaches.
MangaUB is designed to assess the recognition and understanding of content shown in a single panel as well as conveyed across multiple panels.
arXiv Detail & Related papers (2024-07-26T18:21:30Z) - The Manga Whisperer: Automatically Generating Transcriptions for Comics [55.544015596503726]
We present a unified model, Magi, that is able to detect panels, text boxes and character boxes.
We propose a novel approach that is able to sort the detected text boxes in their reading order and generate a dialogue transcript.
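The reading-order idea above can be illustrated with a simple heuristic (this is not Magi's actual algorithm, whose details are in the paper): group detected text boxes into rows by vertical position, then read each row right to left, following manga convention. The box format and row tolerance below are assumptions for illustration.

```python
# Illustrative heuristic only, NOT Magi's actual method: sort text
# boxes into manga reading order (top-to-bottom rows, right-to-left
# within a row). Boxes are (x, y, w, h) tuples with the origin at the
# top-left; row_tol is a made-up grouping threshold.
def reading_order(boxes, row_tol=0.5):
    boxes = sorted(boxes, key=lambda b: b[1])  # sort by top edge
    rows = []
    for b in boxes:
        # Group a box into the current row if its top edge is within
        # row_tol * its height of the row's first box.
        if rows and abs(b[1] - rows[-1][0][1]) < row_tol * b[3]:
            rows[-1].append(b)
        else:
            rows.append([b])
    ordered = []
    for row in rows:
        ordered.extend(sorted(row, key=lambda b: -b[0]))  # right to left
    return ordered

# Two boxes on the top row, one below: the rightmost top box is read first.
boxes = [(0, 0, 10, 10), (20, 0, 10, 10), (0, 20, 10, 10)]
order = reading_order(boxes)
```

A learned model such as Magi can handle the many layouts where a fixed geometric heuristic like this breaks down (overlapping panels, irregular gutters, rotated text).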
arXiv Detail & Related papers (2024-01-18T18:59:09Z) - inkn'hue: Enhancing Manga Colorization from Multiple Priors with Alignment Multi-Encoder VAE [0.0]
We propose a specialized framework for manga colorization.
We leverage established models for shading and vibrant coloring using a multi-encoder VAE.
This structured workflow ensures clear and colorful results, with the option to incorporate reference images and manual hints.
arXiv Detail & Related papers (2023-11-03T09:33:32Z) - M2C: Towards Automatic Multimodal Manga Complement [40.01354682367365]
Multimodal manga analysis focuses on enhancing manga understanding with visual and textual features.
Currently, most comics are hand-drawn and prone to problems such as missing pages, text contamination, and aging.
We first propose the Multimodal Manga Complement task by establishing a new M2C benchmark dataset covering two languages.
arXiv Detail & Related papers (2023-10-26T04:10:16Z) - Deep Geometrized Cartoon Line Inbetweening [98.35956631655357]
Inbetweening involves generating intermediate frames between two black-and-white line drawings.
Existing frame interpolation methods that rely on matching and warping whole images are unsuitable for line inbetweening.
We propose AnimeInbet, which geometrizes line drawings into graphs of endpoints and reframes the inbetweening task as a graph fusion problem.
Our method can effectively capture the sparsity and unique structure of line drawings while preserving the details during inbetweening.
arXiv Detail & Related papers (2023-09-28T17:50:05Z) - Dense Multitask Learning to Reconfigure Comics [63.367664789203936]
We develop a MultiTask Learning (MTL) model to achieve dense predictions for comics panels.
Our method can successfully identify the semantic units as well as the notion of 3D in comic panels.
arXiv Detail & Related papers (2023-07-16T15:10:34Z) - AniGAN: Style-Guided Generative Adversarial Networks for Unsupervised Anime Face Generation [84.52819242283852]
We propose a novel framework to translate a portrait photo-face into an anime appearance.
Our aim is to synthesize anime-faces which are style-consistent with a given reference anime-face.
Existing methods often fail to transfer the styles of reference anime-faces, or introduce noticeable artifacts/distortions in the local shapes of their generated faces.
arXiv Detail & Related papers (2021-02-24T22:47:38Z) - MangaGAN: Unpaired Photo-to-Manga Translation Based on The Methodology of Manga Drawing [27.99490750445691]
We propose MangaGAN, the first method based on Generative Adversarial Network (GAN) for unpaired photo-to-manga translation.
Inspired by how experienced manga artists draw manga, MangaGAN generates the geometric features of a manga face with a designed GAN model.
To produce high-quality manga faces, we propose a structural smoothing loss that smooths stroke lines and avoids noisy pixels, together with a similarity-preserving module.
arXiv Detail & Related papers (2020-04-22T15:23:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.