Characterizing Stage-Aware Writing Assistance in Collaborative Document
Authoring
- URL: http://arxiv.org/abs/2008.08165v1
- Date: Tue, 18 Aug 2020 21:48:04 GMT
- Title: Characterizing Stage-Aware Writing Assistance in Collaborative Document
Authoring
- Authors: Bahareh Sarrafzadeh, Sujay Kumar Jauhar, Michael Gamon, Edward Lank,
and Ryen White
- Abstract summary: We present three studies that explore temporal stages of document authoring.
We conclude that writers do in fact conceptually progress through several distinct phases while authoring documents.
As a first step towards facilitating an intelligent digital writing assistant, we conduct a preliminary investigation into the utility of user interaction log data for predicting the temporal stage of a document.
- Score: 14.512030721220437
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Writing is a complex non-linear process that begins with a mental model of
intent, and progresses through an outline of ideas, to words on paper (and
their subsequent refinement). Despite past research in understanding writing,
Web-scale consumer and enterprise collaborative digital writing environments
are yet to greatly benefit from intelligent systems that understand the stages
of document evolution, providing opportune assistance based on authors'
situated actions and context. In this paper, we present three studies that
explore temporal stages of document authoring. We first survey information
workers at a large technology company about their writing habits and
preferences, concluding that writers do in fact conceptually progress through
several distinct phases while authoring documents. We also explore,
qualitatively, how writing stages are linked to document lifespan. We
supplement these qualitative findings with an analysis of the longitudinal user
interaction logs of a popular digital writing platform over several million
documents. Finally, as a first step towards facilitating an intelligent digital
writing assistant, we conduct a preliminary investigation into the utility of
user interaction log data for predicting the temporal stage of a document. Our
results support the benefit of tools tailored to writing stages, identify
primary tasks associated with these stages, and show that it is possible to
predict stages from anonymous interaction logs. Together, these results argue
for the benefit and feasibility of more tailored digital writing assistance.
Related papers
- BookWorm: A Dataset for Character Description and Analysis [59.186325346763184]
We define two tasks: character description, which generates a brief factual profile, and character analysis, which offers an in-depth interpretation.
We introduce the BookWorm dataset, pairing books from the Gutenberg Project with human-written descriptions and analyses.
Our findings show that retrieval-based approaches outperform hierarchical ones in both tasks.
arXiv Detail & Related papers (2024-10-14T10:55:58Z)
- A Novel Dataset for Non-Destructive Inspection of Handwritten Documents [0.0]
Forensic handwriting examination aims to examine handwritten documents in order to properly define or hypothesize the manuscript's author.
We propose a new and challenging dataset consisting of two subsets: the first consists of 21 documents written either by the classic "pen and paper" approach (and later digitized) or directly acquired on common devices such as tablets.
Preliminary results on the proposed datasets show that 90% classification accuracy can be achieved on the first subset.
arXiv Detail & Related papers (2024-01-09T09:25:58Z) - Innovative Methods for Non-Destructive Inspection of Handwritten
Documents [0.0]
We present a framework capable of extracting and analyzing intrinsic measures of manuscript documents using image processing and deep learning techniques.
By quantifying the Euclidean distance between the feature vectors of the documents to be compared, authorship can be discerned.
Experimental results demonstrate the ability of our method to objectively determine authorship in different writing media, outperforming the state of the art.
arXiv Detail & Related papers (2023-10-17T12:45:04Z) - An end-to-end, interactive Deep Learning based Annotation system for
cursive and print English handwritten text [0.0]
We present an innovative, complete end-to-end pipeline that annotates offline handwritten manuscripts written in both print and cursive English.
This novel method combines a detection system built upon a state-of-the-art text detection model with a custom-made Deep Learning model for the recognition system.
arXiv Detail & Related papers (2023-04-18T00:24:07Z) - Exploitation and exploration in text evolution. Quantifying planning and
translation flows during writing [0.13108652488669734]
We introduce measures to quantify subcycles of planning (exploration) and translation (exploitation) during the writing process.
This dataset comes from a series of writing workshops in which, through innovative versioning software, we were able to record all the steps in the construction of a text.
arXiv Detail & Related papers (2023-02-07T17:52:33Z) - Unified Pretraining Framework for Document Understanding [52.224359498792836]
We present UDoc, a new unified pretraining framework for document understanding.
UDoc is designed to support most document understanding tasks, extending the Transformer to take multimodal embeddings as input.
An important feature of UDoc is that it learns a generic representation by making use of three self-supervised losses.
arXiv Detail & Related papers (2022-04-22T21:47:04Z) - Toward Educator-focused Automated Scoring Systems for Reading and
Writing [0.0]
This paper addresses the challenges of data and label availability, authentic and extended writing, domain scoring, prompt and source variety, and transfer learning.
It employs techniques that preserve essay length as an important feature without increasing model training costs.
arXiv Detail & Related papers (2021-12-22T15:44:30Z) - CitationIE: Leveraging the Citation Graph for Scientific Information
Extraction [89.33938657493765]
We use the citation graph of referential links between citing and cited papers.
We observe a sizable improvement in end-to-end information extraction over the state-of-the-art.
arXiv Detail & Related papers (2021-06-03T03:00:12Z) - A Survey of Deep Learning Approaches for OCR and Document Understanding [68.65995739708525]
We review different techniques for document understanding for documents written in English.
We consolidate methodologies present in the literature to act as a jumping-off point for researchers exploring this area.
arXiv Detail & Related papers (2020-11-27T03:05:59Z) - Explaining Relationships Between Scientific Documents [55.23390424044378]
We address the task of explaining relationships between two scientific documents using natural language text.
In this paper we establish a dataset of 622K examples from 154K documents.
arXiv Detail & Related papers (2020-02-02T03:54:47Z) - Conversations with Documents. An Exploration of Document-Centered
Assistance [55.60379539074692]
Document-centered assistance, for example, to help an individual quickly review a document, has seen less significant progress.
We present a survey to understand the space of document-centered assistance and the capabilities people expect in this scenario.
We present a set of initial machine learned models that show that (a) we can accurately detect document-centered questions, and (b) we can build reasonably accurate models for answering such questions.
arXiv Detail & Related papers (2020-01-27T17:10:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.