ReelFramer: Human-AI Co-Creation for News-to-Video Translation
- URL: http://arxiv.org/abs/2304.09653v3
- Date: Mon, 11 Mar 2024 03:06:46 GMT
- Title: ReelFramer: Human-AI Co-Creation for News-to-Video Translation
- Authors: Sitong Wang, Samia Menon, Tao Long, Keren Henderson, Dingzeyu Li,
Kevin Crowston, Mark Hansen, Jeffrey V. Nickerson, Lydia B. Chilton
- Abstract summary: We introduce ReelFramer, a human-AI co-creative system that helps journalists translate print articles into scripts and storyboards.
Narrative framing introduces the necessary diversity to translate various articles into reels, and establishing foundational details helps generate scripts that are more relevant and coherent.
- Score: 18.981919581170175
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Short videos on social media are the dominant way young people consume
content. News outlets aim to reach audiences through news reels -- short videos
conveying news -- but struggle to translate traditional journalistic formats
into short, entertaining videos. To translate news into social media reels, we
support journalists in reframing the narrative. In literature, narrative
framing is a high-level structure that shapes the overall presentation of a
story. We identified three narrative framings for reels that adapt social media
norms but preserve news value, each with a different balance of information and
entertainment. We introduce ReelFramer, a human-AI co-creative system that
helps journalists translate print articles into scripts and storyboards.
ReelFramer supports exploring multiple narrative framings to find one
appropriate to the story. AI suggests foundational narrative details, including
characters, plot, setting, and key information. ReelFramer also supports visual
framing; AI suggests character and visual detail designs before generating a
full storyboard. Our studies show that narrative framing introduces the
necessary diversity to translate various articles into reels, and establishing
foundational details helps generate scripts that are more relevant and
coherent. We also discuss the benefits of using narrative framing and
foundational details in content retargeting.
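The abstract describes ReelFramer only at the workflow level: choose a narrative framing, establish foundational details (characters, plot, setting, key information), then draft a script grounded in those details. The sketch below is a hypothetical illustration of how such a prompt chain could be wired up, not the authors' implementation; the `complete` helper, the framing names, and the prompt wording are all assumptions.

```python
# Hypothetical sketch of a ReelFramer-style prompt chain (not the authors' code).
# Pipeline order follows the abstract: pick a narrative framing, establish
# foundational details, then draft a script grounded in those details.

from dataclasses import dataclass

# Framing names are assumptions; the paper identifies three framings balancing
# information and entertainment, but the abstract does not name them.
FRAMINGS = ["information-forward", "character-driven", "comedic"]


def complete(prompt: str) -> str:
    """Placeholder for any text-generation backend (plug in your own LLM client)."""
    raise NotImplementedError("connect this to a model of your choice")


@dataclass
class FoundationalDetails:
    characters: str
    plot: str
    setting: str
    key_information: str


def establish_details(article: str, framing: str) -> FoundationalDetails:
    """Suggest each foundational element separately so a journalist can edit each one."""
    def ask(field: str) -> str:
        return complete(
            f"Article:\n{article}\n\nFraming: {framing}\n"
            f"Suggest the {field} for a short news reel."
        )
    return FoundationalDetails(
        characters=ask("main characters"),
        plot=ask("plot"),
        setting=ask("setting"),
        key_information=ask("key information to preserve"),
    )


def draft_script(article: str, framing: str, details: FoundationalDetails) -> str:
    """Draft a reel script conditioned on the established foundational details."""
    return complete(
        f"Write a reel script for this article using a {framing} framing.\n"
        f"Characters: {details.characters}\nPlot: {details.plot}\n"
        f"Setting: {details.setting}\nKey info: {details.key_information}\n\n"
        f"Article:\n{article}"
    )
```

In a co-creative loop, a journalist would try several entries of FRAMINGS, edit the suggested details, and regenerate the script until one framing fits the story.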
Related papers
- StoryAgent: Customized Storytelling Video Generation via Multi-Agent Collaboration [88.94832383850533]
We propose a multi-agent framework designed for Customized Storytelling Video Generation (CSVG).
StoryAgent decomposes CSVG into distinct subtasks assigned to specialized agents, mirroring the professional production process.
Specifically, we introduce a customized Image-to-Video (I2V) method, LoRA-BE, to enhance intra-shot temporal consistency.
Our contributions include the introduction of StoryAgent, a versatile framework for video generation tasks, and novel techniques for preserving protagonist consistency.
arXiv Detail & Related papers (2024-11-07T18:00:33Z)
- Agents' Room: Narrative Generation through Multi-step Collaboration [54.98886593802834]
We propose a generation framework inspired by narrative theory that decomposes narrative writing into subtasks tackled by specialized agents.
We show that Agents' Room generates stories preferred by expert evaluators over those produced by baseline systems.
arXiv Detail & Related papers (2024-10-03T15:44:42Z)
- The Lost Melody: Empirical Observations on Text-to-Video Generation From A Storytelling Perspective [4.471962177124311]
We examine text-to-video generation from a storytelling perspective, which has hardly been investigated.
We propose an evaluation framework for storytelling aspects of videos, and discuss the potential future directions.
arXiv Detail & Related papers (2024-05-13T02:25:08Z)
- Movie101v2: Improved Movie Narration Benchmark [53.54176725112229]
Automatic movie narration aims to generate video-aligned plot descriptions to assist visually impaired audiences.
We introduce Movie101v2, a large-scale, bilingual dataset with enhanced data quality specifically designed for movie narration.
Based on our new benchmark, we baseline a range of large vision-language models, including GPT-4V, and conduct an in-depth analysis of the challenges in narration generation.
arXiv Detail & Related papers (2024-04-20T13:15:27Z)
- Multi-modal News Understanding with Professionally Labelled Videos (ReutersViLNews) [25.78619140103048]
We present a large-scale analysis of an in-house dataset collected by the Reuters News Agency, called the Reuters Video-Language News (ReutersViLNews) dataset.
The dataset focuses on high-level video-language understanding with an emphasis on long-form news.
The results suggest that news-oriented videos are a substantial challenge for current video-language understanding algorithms.
arXiv Detail & Related papers (2024-01-23T00:42:04Z)
- Shot2Story20K: A New Benchmark for Comprehensive Understanding of Multi-shot Videos [58.13927287437394]
We present a new multi-shot video understanding benchmark Shot2Story20K with detailed shot-level captions and comprehensive video summaries.
Preliminary experiments show that generating a long and comprehensive video summary remains challenging.
arXiv Detail & Related papers (2023-12-16T03:17:30Z)
- StoryBench: A Multifaceted Benchmark for Continuous Story Visualization [42.439670922813434]
We introduce StoryBench: a new, challenging multi-task benchmark to reliably evaluate text-to-video models.
Our benchmark includes three video generation tasks of increasing difficulty: action execution, story continuation, and story generation.
We evaluate small yet strong text-to-video baselines, and show the benefits of training on story-like data algorithmically generated from existing video captions.
arXiv Detail & Related papers (2023-08-22T17:53:55Z)
- Connecting Vision and Language with Video Localized Narratives [54.094554472715245]
We propose Video Localized Narratives, a new form of multimodal video annotations connecting vision and language.
In the original Localized Narratives, annotators speak and move their mouse simultaneously on an image, thus grounding each word with a mouse trace segment.
Our new protocol empowers annotators to tell the story of a video with Localized Narratives, capturing even complex events involving multiple actors interacting with each other and with several passive objects.
arXiv Detail & Related papers (2023-02-22T09:04:00Z)
- Narration Generation for Cartoon Videos [35.814965300322015]
We propose a new task, narration generation: complementing videos with narration texts to be interjected at several points.
We collect a new dataset from the animated television series Peppa Pig.
arXiv Detail & Related papers (2021-01-17T23:23:09Z)
- CompRes: A Dataset for Narrative Structure in News [2.4578723416255754]
We introduce CompRes -- the first dataset for narrative structure in news media.
We use the annotated dataset to train several supervised models to identify the different narrative elements.
arXiv Detail & Related papers (2020-07-09T15:21:59Z)
- Text Synopsis Generation for Egocentric Videos [72.52130695707008]
We propose to generate a textual synopsis, consisting of a few sentences describing the most important events in a long egocentric video.
Users can read the short text to gain insight about the video, and more importantly, efficiently search through the content of a large video database.
arXiv Detail & Related papers (2020-05-08T00:28:00Z)