Changing the Narrative Perspective: From Deictic to Anaphoric Point of View
- URL: http://arxiv.org/abs/2103.04176v1
- Date: Sat, 6 Mar 2021 19:03:42 GMT
- Title: Changing the Narrative Perspective: From Deictic to Anaphoric Point of View
- Authors: Mike Chen and Razvan Bunescu
- Abstract summary: We introduce the task of changing the narrative point of view, where characters are assigned a narrative perspective that is different from the one originally used by the writer.
The resulting shift in the narrative point of view alters the reading experience and can be used as a tool in fiction writing.
We describe a pipeline for processing raw text that relies on a neural architecture for mention selection.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We introduce the task of changing the narrative point of view, where
characters are assigned a narrative perspective that is different from the one
originally used by the writer. The resulting shift in the narrative point of
view alters the reading experience and can be used as a tool in fiction writing
or to generate types of text ranging from educational to self-help and
self-diagnosis. We introduce a benchmark dataset containing a wide range of
types of narratives annotated with changes in point of view from deictic (first
or second person) to anaphoric (third person) and describe a pipeline for
processing raw text that relies on a neural architecture for mention selection.
Evaluations on the new benchmark dataset show that the proposed architecture
substantially outperforms the baselines by generating mentions that are less
ambiguous and more natural.
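For intuition, the rewrite the paper targets takes a first-person (deictic) passage and re-narrates it in the third person (anaphoric), choosing for each mention of the narrator either the character's name or a pronoun. Below is a minimal Python sketch of that transformation, not the authors' pipeline: the paper relies on coreference information and a neural mention-selection model, whereas the function name, pronoun tables, and name-then-pronoun heuristic here are illustrative assumptions only.

```python
# Toy sketch of the deictic -> anaphoric rewrite, NOT the paper's pipeline.
# Mention selection is simulated by using the proper name once and pronouns
# afterwards; verb agreement and capitalization are ignored (the example
# stays in past tense, where agreement is unaffected).

import re

# Hypothetical map from deictic (first-person) forms to third-person slots.
DEICTIC_FORMS = {"me": "obj", "my": "poss", "mine": "poss_abs", "myself": "refl"}

def shift_point_of_view(text: str, name: str, gender: str = "she") -> str:
    """Rewrite first-person mentions of the narrator as third-person
    mentions of `name`, using the proper name for the first subject
    mention and pronouns afterwards (a stand-in for mention selection)."""
    subj, obj, poss, poss_abs, refl = {
        "she": ("she", "her", "her", "hers", "herself"),
        "he": ("he", "him", "his", "his", "himself"),
    }[gender]
    slots = {"obj": obj, "poss": poss, "poss_abs": poss_abs, "refl": refl}
    used_name = False

    def replace(match: re.Match) -> str:
        nonlocal used_name
        word = match.group(0).lower()
        if word == "i":                    # subject mention of the narrator
            if not used_name:
                used_name = True
                return name                # first mention: proper name
            return subj                    # later mentions: pronoun
        return slots[DEICTIC_FORMS[word]]  # non-subject mentions

    return re.sub(r"\b(I|me|my|mine|myself)\b", replace, text, flags=re.IGNORECASE)

print(shift_point_of_view("I told my sister the plan before I left.", "Alice"))
# -> Alice told her sister the plan before she left.
```

The part this sketch fakes with a fixed rule is exactly what the paper learns: deciding, mention by mention, whether the name or a pronoun reads as less ambiguous and more natural.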
Related papers
- Scene Graph Generation with Role-Playing Large Language Models [50.252588437973245]
Current approaches for open-vocabulary scene graph generation (OVSGG) use vision-language models such as CLIP.
We propose SDSGG, a scene-specific, description-based OVSGG framework.
To capture the complicated interplay between subjects and objects, we propose a new lightweight module called mutual visual adapter.
arXiv Detail & Related papers (2024-10-20T11:40:31Z)
- Generating Visual Stories with Grounded and Coreferent Characters [63.07511918366848]
We present the first model capable of predicting visual stories with consistently grounded and coreferent character mentions.
Our model is finetuned on a new dataset which we build on top of the widely used VIST benchmark.
We also propose new evaluation metrics to measure the richness of characters and coreference in stories.
arXiv Detail & Related papers (2024-09-20T14:56:33Z)
- Evolving Storytelling: Benchmarks and Methods for New Character Customization with Diffusion Models [79.21968152209193]
We introduce the NewEpisode benchmark to evaluate generative models' adaptability in generating new stories with fresh characters.
We propose EpicEvo, a method that customizes a diffusion-based visual story generation model with a single story featuring the new characters, seamlessly integrating them into established character dynamics.
arXiv Detail & Related papers (2024-05-20T07:54:03Z)
- WAVER: Writing-style Agnostic Text-Video Retrieval via Distilling Vision-Language Models Through Open-Vocabulary Knowledge [12.034917651508524]
WAVER is a cross-domain knowledge distillation framework built on vision-language models.
WAVER capitalizes on the open-vocabulary properties of pre-trained vision-language models.
It achieves state-of-the-art performance in the text-video retrieval task while handling writing-style variations.
arXiv Detail & Related papers (2023-12-15T03:17:37Z)
- Detecting and Grounding Important Characters in Visual Stories [18.870236356616907]
We introduce the VIST-Character dataset, which provides rich character-centric annotations.
Based on this dataset, we propose two new tasks: important character detection and character grounding in visual stories.
We develop simple, unsupervised models based on distributional similarity and pre-trained vision-and-language models.
arXiv Detail & Related papers (2023-03-30T18:24:06Z)
- Integrating Visuospatial, Linguistic and Commonsense Structure into Story Visualization [81.26077816854449]
We first explore the use of constituency parse trees for encoding structured input.
Second, we augment the structured input with commonsense information and study the impact of this external knowledge on visual story generation.
Third, we incorporate visual structure via bounding boxes and dense captioning to provide feedback about the characters/objects in generated images.
arXiv Detail & Related papers (2021-10-21T00:16:02Z)
- Topical Change Detection in Documents via Embeddings of Long Sequences [4.13878392637062]
We formulate the task of text segmentation as an independent supervised prediction task.
By fine-tuning on paragraphs of similar sections, we are able to show that learned features encode topic information.
Unlike previous approaches, which mostly operate at the sentence level, we consistently use a broader context.
arXiv Detail & Related papers (2020-12-07T12:09:37Z)
- Finding It at Another Side: A Viewpoint-Adapted Matching Encoder for Change Captioning [41.044241265804125]
We propose a novel visual encoder to explicitly distinguish viewpoint changes from semantic changes in the change captioning task.
We also propose a novel reinforcement learning process to fine-tune the attention directly with language evaluation rewards.
Our method outperforms state-of-the-art approaches by a large margin on both the Spot-the-Diff and CLEVR-Change datasets.
arXiv Detail & Related papers (2020-09-30T00:13:49Z)
- Temporal Embeddings and Transformer Models for Narrative Text Understanding [72.88083067388155]
We present two approaches to narrative text understanding for character relationship modelling.
The temporal evolution of these relations is described by dynamic word embeddings, which are designed to learn semantic changes over time.
In contrast, a supervised learning approach based on the state-of-the-art transformer model BERT is used to detect static relations between characters.
arXiv Detail & Related papers (2020-03-19T14:23:12Z)
- Learning to Select Bi-Aspect Information for Document-Scale Text Content Manipulation [50.01708049531156]
We focus on a new practical task, document-scale text content manipulation, which is the opposite of text style transfer.
In detail, the input is a set of structured records and a reference text for describing another recordset.
The output is a summary that accurately describes the partial content in the source recordset in the same writing style as the reference.
arXiv Detail & Related papers (2020-02-24T12:52:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.