Related papers: Crossing Borders: A Multimodal Challenge for Indian Poetry Translation and Image Generation

URL: http://arxiv.org/abs/2511.13689v2
Date: Tue, 18 Nov 2025 04:27:26 GMT
Title: Crossing Borders: A Multimodal Challenge for Indian Poetry Translation and Image Generation
Authors: Sofia Jamil, Kotla Sai Charan, Sriparna Saha, Koustava Goswami, Joseph K J,
Abstract summary: Indian poetry, known for its linguistic complexity and deep cultural resonance, has a rich and varied heritage spanning thousands of years. Despite its cultural significance, existing works on poetry have largely overlooked Indian language poems. We propose the Translation and Image Generation (TAI) framework, leveraging Large Language Models (LLMs) and Latent Diffusion Models.
Score: 14.583411423291233
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Indian poetry, known for its linguistic complexity and deep cultural resonance, has a rich and varied heritage spanning thousands of years. However, its layered meanings, cultural allusions, and sophisticated grammatical constructions often pose challenges for comprehension, especially for non-native speakers or readers unfamiliar with its context and language. Despite its cultural significance, existing works on poetry have largely overlooked Indian language poems. In this paper, we propose the Translation and Image Generation (TAI) framework, leveraging Large Language Models (LLMs) and Latent Diffusion Models through appropriate prompt tuning. Our framework supports the United Nations Sustainable Development Goals of Quality Education (SDG 4) and Reduced Inequalities (SDG 10) by enhancing the accessibility of culturally rich Indian-language poetry to a global audience. It includes (1) a translation module that uses an Odds Ratio Preference Alignment Algorithm to accurately translate morphologically rich poetry into English, and (2) an image generation module that employs a semantic graph to capture tokens, dependencies, and semantic relationships between metaphors and their meanings, to create visually meaningful representations of Indian poems. Our comprehensive experimental evaluation, including both human and quantitative assessments, demonstrates the superiority of TAI Diffusion in poem image generation tasks, outperforming strong baselines. To further address the scarcity of resources for Indian-language poetry, we introduce the Morphologically Rich Indian Language Poems MorphoVerse Dataset, comprising 1,570 poems across 21 low-resource Indian languages. By addressing the gap in poetry translation and visual comprehension, this work aims to broaden accessibility and enrich the reader's experience.

Related papers

PoemTale Diffusion: Minimising Information Loss in Poem to Image Generation with Multi-Stage Prompt Refinement [18.293592213622183]
PoemTale Diffusion aims to minimise the information that is lost during poetic text-to-image conversion. To support this, we adapt existing state-of-the-art diffusion models by modifying their self-attention mechanisms. To encourage research in the field of poetry, we introduce the P4I dataset, consisting of 1111 poems.
arXiv Detail & Related papers (2025-07-18T07:33:08Z)
Picturized and Recited with Dialects: A Multimodal Chinese Representation Framework for Sentiment Analysis of Classical Chinese Poetry [7.374104697960381]
We propose a dialect-enhanced multimodal framework for classical Chinese poetry sentiment analysis. We extract sentence-level audio features from the poetry and incorporate audio from multiple dialects. Our framework outperforms state-of-the-art methods on two public datasets.
arXiv Detail & Related papers (2025-05-19T14:58:44Z)
Seedream 2.0: A Native Chinese-English Bilingual Image Generation Foundation Model [69.09404597939744]
Seedream 2.0 is a native Chinese-English bilingual image generation foundation model. It adeptly manages text prompts in both Chinese and English, supporting bilingual image generation and text rendering. It is integrated with a self-developed bilingual large language model as a text encoder, allowing it to learn native knowledge directly from massive data.
arXiv Detail & Related papers (2025-03-10T17:58:33Z)
Vietnamese Poem Generation & The Prospect Of Cross-Language Poem-To-Poem Translation [0.0]
We propose using Large Language Models to generate Vietnamese poems from natural language prompts. The GPT-3 Babbage variant achieves a custom evaluation score of 0.8, specifically tailored to the "luc bat" genre of Vietnamese poetry.
arXiv Detail & Related papers (2024-01-02T07:46:34Z)
NusaWrites: Constructing High-Quality Corpora for Underrepresented and Extremely Low-Resource Languages [54.808217147579036]
We conduct a case study on Indonesian local languages. We compare the effectiveness of online scraping, human translation, and paragraph writing by native speakers in constructing datasets. Our findings demonstrate that datasets generated through paragraph writing by native speakers exhibit superior quality in terms of lexical diversity and cultural content.
arXiv Detail & Related papers (2023-09-19T14:42:33Z)
PoetryDiffusion: Towards Joint Semantic and Metrical Manipulation in Poetry Generation [58.36105306993046]
Controllable text generation is a challenging and meaningful field in natural language generation (NLG). In this paper, we pioneer the use of the Diffusion model for generating sonnets and Chinese SongCi poetry. Our model outperforms existing models in automatic evaluation of semantic, metrical, and overall performance as well as human evaluation.
arXiv Detail & Related papers (2023-06-14T11:57:31Z)
Multi-lingual and Multi-cultural Figurative Language Understanding [69.47641938200817]
Figurative language permeates human communication, but is relatively understudied in NLP. We create a dataset for seven diverse languages associated with a variety of cultures: Hindi, Indonesian, Javanese, Kannada, Sundanese, Swahili and Yoruba. Our dataset reveals that each language relies on cultural and regional concepts for figurative expressions, with the highest overlap between languages originating from the same region. All languages exhibit a significant deficiency compared to English, with variations in performance reflecting the availability of pre-training and fine-tuning data.
arXiv Detail & Related papers (2023-05-25T15:30:31Z)
Semantics of European poetry is shaped by conservative forces: The relationship between poetic meter and meaning in accentual-syllabic verse [0.0]
We provide the first large-scale formal evidence of the persistent association between poetic meter and semantics in 18th-19th century European literatures. Our study traces this association through a series of clustering experiments using the abstracted semantic features of 150,000 poems.
arXiv Detail & Related papers (2021-09-15T08:20:01Z)
Don't Go Far Off: An Empirical Study on Neural Poetry Translation [13.194404923699782]
We present an empirical investigation for poetry translation along several dimensions. We contribute a parallel dataset of poetry translations for several language pairs. Our results show that multilingual fine-tuning on poetic text significantly outperforms multilingual fine-tuning on non-poetic text that is 35X larger in size.
arXiv Detail & Related papers (2021-09-07T10:00:44Z)
CCPM: A Chinese Classical Poetry Matching Dataset [50.90794811956129]
We propose a novel task to assess a model's semantic understanding of poetry by poem matching. This task requires the model to select one line of Chinese classical poetry among four candidates according to the modern Chinese translation of a line of poetry. To construct this dataset, we first obtain a set of parallel data of Chinese classical poetry and modern Chinese translation.
arXiv Detail & Related papers (2021-06-03T16:49:03Z)
MixPoet: Diverse Poetry Generation via Learning Controllable Mixed Latent Space [79.70053419040902]
We propose MixPoet, a novel model that absorbs multiple factors to create various styles and promote diversity. Based on a semi-supervised variational autoencoder, our model disentangles the latent space into some subspaces, with each conditioned on one influence factor by adversarial training. Experiment results on Chinese poetry demonstrate that MixPoet improves both diversity and quality against three state-of-the-art models.
arXiv Detail & Related papers (2020-03-13T03:31:29Z)

This list is automatically generated from the titles and abstracts of the papers on this site.