R$^2$: A LLM Based Novel-to-Screenplay Generation Framework with Causal Plot Graphs
- URL: http://arxiv.org/abs/2503.15655v1
- Date: Wed, 19 Mar 2025 19:09:40 GMT
- Title: R$^2$: A LLM Based Novel-to-Screenplay Generation Framework with Causal Plot Graphs
- Authors: Zefeng Lin, Yi Xiao, Zhiqiang Mo, Qifan Zhang, Jie Wang, Jiayang Chen, Jiajing Zhang, Hui Zhang, Zhengyi Liu, Xianyong Fang, Xiaohua Xu
- Abstract summary: We propose a framework to automatically adapt novels into screenplays based on large language models (LLMs). The causality-embedded plot lines should be effectively extracted for coherent rewriting. Two corresponding tactics are proposed: 1) a hallucination-aware refinement method (HAR) to iteratively discover and eliminate the effects of hallucinations; and 2) a causal plot-graph construction method (CPC) based on a greedy cycle-breaking algorithm to efficiently construct plot lines with event causalities.
- Score: 12.751879151553918
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Automatically adapting novels into screenplays is important for the TV, film, and opera industries to promote products at low cost. The strong performance of large language models (LLMs) in long-text generation motivates us to propose an LLM-based framework, Reader-Rewriter (R$^2$), for this task. However, there are two fundamental challenges. First, LLM hallucinations may cause inconsistent plot extraction and screenplay generation. Second, the causality-embedded plot lines must be effectively extracted for coherent rewriting. Therefore, two corresponding tactics are proposed: 1) a hallucination-aware refinement method (HAR) to iteratively discover and eliminate the effects of hallucinations; and 2) a causal plot-graph construction method (CPC) based on a greedy cycle-breaking algorithm to efficiently construct plot lines with event causalities. Using these techniques, R$^2$ employs two modules to mimic the human screenplay rewriting process: the Reader module adopts a sliding window and CPC to build the causal plot graphs, while the Rewriter module first generates scene outlines based on the graphs and then the screenplays. HAR is integrated into both modules for accurate LLM inference. Experimental results demonstrate the superiority of R$^2$, which substantially outperforms three existing approaches (51.3%, 22.6%, and 57.1% absolute increases) in the overall win rate under GPT-4o pairwise comparison.
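To make the CPC idea more concrete, the sketch below shows one plausible way a greedy cycle-breaking pass could work: rank LLM-extracted causal edges by confidence and keep only those that leave the event graph acyclic. This is a minimal illustration under the assumption that the Reader has already produced events and scored causal edges; the function names, data structures, and example events are hypothetical, not the authors' implementation.

```python
# Minimal sketch of a greedy cycle-breaking step for causal plot-graph
# construction (CPC). Assumes events and LLM-scored causal edges have
# already been extracted; all names here are illustrative placeholders.

from collections import defaultdict

def creates_cycle(adj, src, dst):
    """Return True if adding edge src->dst would close a cycle,
    i.e. dst can already reach src in the current graph."""
    stack, seen = [dst], set()
    while stack:
        node = stack.pop()
        if node == src:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(adj[node])
    return False

def greedy_cycle_breaking(causal_edges):
    """Keep the strongest causal edges that form a DAG (the plot lines).

    causal_edges: list of (cause, effect, confidence) triples, where
    confidence is an LLM-assigned score for the causal link.
    """
    adj = defaultdict(list)
    kept = []
    # Consider edges from most to least confident; drop any edge that
    # would introduce a cycle, so the result is an acyclic plot graph.
    for cause, effect, conf in sorted(causal_edges, key=lambda e: -e[2]):
        if not creates_cycle(adj, cause, effect):
            adj[cause].append(effect)
            kept.append((cause, effect, conf))
    return kept

if __name__ == "__main__":
    edges = [
        ("hero leaves home", "mentor dies", 0.9),
        ("mentor dies", "hero returns", 0.8),
        ("hero returns", "hero leaves home", 0.3),  # weak edge that would close a cycle
    ]
    for cause, effect, conf in greedy_cycle_breaking(edges):
        print(f"{cause} -> {effect} ({conf})")
```

Sorting edges by confidence and rejecting any edge that closes a cycle is a common greedy heuristic for approximating a feedback-arc-set solution; the paper's actual CPC algorithm may score and order edges differently.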
Related papers
- Compile Scene Graphs with Reinforcement Learning [69.36723767339001]
Next-token prediction is the fundamental principle for training large language models (LLMs).
We introduce R1-SGG, a multimodal LLM (M-LLM) trained via supervised fine-tuning (SFT) on the scene graph dataset.
We design a graph-centric reward function that integrates node-level rewards, edge-level rewards, and a format consistency reward.
arXiv Detail & Related papers (2025-04-18T10:46:22Z) - IGDA: Interactive Graph Discovery through Large Language Model Agents [6.704529554100875]
Large language models (LLMs) have emerged as a powerful method for discovery. We propose IGDA as a powerful method for graph discovery complementary to existing numerically driven approaches.
arXiv Detail & Related papers (2025-02-24T14:24:27Z) - Zero-Shot Statistical Tests for LLM-Generated Text Detection using Finite Sample Concentration Inequalities [13.657259851747126]
Verifying the provenance of content is crucial to the function of many organizations, e.g., educational institutions, social media platforms, firms, etc.
This problem is becoming increasingly challenging as text generated by Large Language Models (LLMs) becomes almost indistinguishable from human-generated content.
We show that our tests' type I and type II errors decrease exponentially as text length increases.
Practically, our work enables guaranteed finding of the origin of harmful or false LLM-generated text, which can be useful for combating misinformation and compliance with emerging AI regulations.
arXiv Detail & Related papers (2025-01-04T23:51:43Z) - Accelerating Multimodal Large Language Models by Searching Optimal Vision Token Reduction [62.8375542401319]
Multimodal Large Language Models (MLLMs) encode the input image(s) as vision tokens and feed them into the language backbone. The number of vision tokens grows quadratically with image resolution, leading to huge computational costs. We propose a greedy search algorithm (G-Search) to find the least number of vision tokens to keep at each layer, from the shallow to the deep.
arXiv Detail & Related papers (2024-11-30T18:54:32Z) - FiSTECH: Financial Style Transfer to Enhance Creativity without Hallucinations in LLMs [0.3958317527488534]
We explore the self-corrective auto-regressive qualities of large language models (LLMs) to learn creativity in writing styles with minimal prompting.
We propose a novel two-stage fine-tuning (FT) strategy wherein, in the first stage, public-domain financial reports are used to train for writing style while allowing the LLM to hallucinate.
Our proposed two-stage fine-tuning boosts the accuracy of financial question answering two-fold while reducing hallucinations by over 50%.
arXiv Detail & Related papers (2024-08-09T22:29:23Z) - Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots [66.95139377783966]
We introduce Plot2Code, a comprehensive visual coding benchmark for Multi-modal Large Language Models.
We collect 132 manually selected high-quality matplotlib plots across six plot types from publicly available matplotlib galleries.
For each plot, we carefully provide its source code and a descriptive instruction summarized by GPT-4.
arXiv Detail & Related papers (2024-05-13T17:59:22Z) - LLMRefine: Pinpointing and Refining Large Language Models via Fine-Grained Actionable Feedback [65.84061725174269]
Recent large language models (LLMs) leverage human feedback to improve their generation quality.
We propose LLMRefine, an inference time optimization method to refine LLM's output.
We conduct experiments on three text generation tasks, including machine translation, long-form question answering (QA), and topical summarization.
LLMRefine consistently outperforms all baseline approaches, achieving improvements of up to 1.7 MetricX points on translation tasks, 8.1 ROUGE-L on ASQA, and 2.2 ROUGE-L on topical summarization.
arXiv Detail & Related papers (2023-11-15T19:52:11Z) - Integrating Graphs with Large Language Models: Methods and Prospects [68.37584693537555]
Large language models (LLMs) have emerged as frontrunners, showcasing unparalleled prowess in diverse applications.
Merging the capabilities of LLMs with graph-structured data has been a topic of keen interest.
This paper bifurcates such integrations into two predominant categories.
arXiv Detail & Related papers (2023-10-09T07:59:34Z) - BooookScore: A systematic exploration of book-length summarization in the era of LLMs [53.42917858142565]
We develop an automatic metric, BooookScore, that measures the proportion of sentences in a summary that do not contain any of the identified error types.
We find that closed-source LLMs such as GPT-4 and Claude 2 produce summaries with higher BooookScore than those generated by open-source models.
arXiv Detail & Related papers (2023-10-01T20:46:44Z)