Understanding How Paper Writers Use AI-Generated Captions in Figure Caption Writing
- URL: http://arxiv.org/abs/2501.06317v1
- Date: Fri, 10 Jan 2025 19:39:06 GMT
- Title: Understanding How Paper Writers Use AI-Generated Captions in Figure Caption Writing
- Authors: Ho Yin Ng, Ting-Yao Hsu, Jiyoo Min, Sungchul Kim, Ryan A. Rossi, Tong Yu, Hyunggu Jung, Ting-Hao 'Kenneth' Huang
- Abstract summary: This paper investigates how paper authors incorporate AI-generated captions into their writing process through a user study involving 18 participants.
By analyzing video recordings of the writing process through interaction analysis, we observed that participants often began by copying and refining AI-generated captions.
Paper writers favored longer, detail-rich captions that integrated textual and visual elements but found current AI models less effective for complex figures.
- Abstract: Figures and their captions play a key role in scientific publications. However, despite their importance, many captions in published papers are poorly crafted, largely due to a lack of attention by paper authors. While prior AI research has explored caption generation, it has mainly focused on reader-centered use cases, where users evaluate generated captions rather than actively integrating them into their writing. This paper addresses this gap by investigating how paper authors incorporate AI-generated captions into their writing process through a user study involving 18 participants. Each participant rewrote captions for two figures from their own recently published work, using captions generated by state-of-the-art AI models as a resource. By analyzing video recordings of the writing process through interaction analysis, we observed that participants often began by copying and refining AI-generated captions. Paper writers favored longer, detail-rich captions that integrated textual and visual elements but found current AI models less effective for complex figures. These findings highlight the nuanced and diverse nature of figure caption composition, revealing design opportunities for AI systems to better support the challenges of academic writing.
Related papers
- How Does the Disclosure of AI Assistance Affect the Perceptions of Writing? [29.068596156140913]
We study whether and how the disclosure of the level and type of AI assistance in the writing process would affect people's perceptions of the writing.
Our results suggest that disclosing the AI assistance in the writing process, especially if AI has provided assistance in generating new content, decreases the average quality ratings.
arXiv Detail & Related papers (2024-10-06T16:45:33Z)
- Towards Retrieval-Augmented Architectures for Image Captioning [81.11529834508424]
This work presents a novel approach towards developing image captioning models that utilize an external kNN memory to improve the generation process.
Specifically, we propose two model variants that incorporate a knowledge retriever component that is based on visual similarities.
We experimentally validate our approach on COCO and nocaps datasets and demonstrate that incorporating an explicit external memory can significantly enhance the quality of captions.
arXiv Detail & Related papers (2024-05-21T18:02:07Z)
- Learning text-to-video retrieval from image captioning [59.81537951811595]
We describe a protocol to study text-to-video retrieval training with unlabeled videos.
We assume (i) no access to labels for any videos, and (ii) access to labeled images in the form of text.
We show that automatically labeling video frames with image captioning allows text-to-video retrieval training.
arXiv Detail & Related papers (2024-04-26T15:56:08Z)
- Purposeful remixing with generative AI: Constructing designer voice in multimodal composing [16.24460569356749]
This study investigates whether the use of generative AI tools could help student authors construct a more consistent voice in multimodal writing.
The study sheds light on the intentional and discursive nature of multimodal writing with AI as afforded by the technological flexibility.
arXiv Detail & Related papers (2024-03-28T02:15:03Z)
- SciCapenter: Supporting Caption Composition for Scientific Figures with Machine-Generated Captions and Ratings [28.973082312034343]
This paper introduces SciCapenter, an interactive system that puts together cutting-edge AI technologies for scientific figure captions.
SciCapenter generates a variety of captions for each figure in a scholarly article, providing scores and a comprehensive checklist to assess caption quality.
A user study with Ph.D. students indicates that SciCapenter significantly lowers the cognitive load of caption writing.
arXiv Detail & Related papers (2024-03-26T15:16:14Z)
- Perceptions and Detection of AI Use in Manuscript Preparation for Academic Journals [1.881901067333374]
Large Language Models (LLMs) have produced both excitement and worry about how AI will impact academic writing.
Authors of academic publications may decide to voluntarily disclose any AI tools they use to revise their manuscripts.
Journals and conferences could begin mandating disclosure and/or turn to using detection services.
arXiv Detail & Related papers (2023-11-19T06:04:46Z)
- Summaries as Captions: Generating Figure Captions for Scientific Documents with Automated Text Summarization [31.619379039184263]
Figure caption generation can be more effectively tackled as a text summarization task in scientific documents.
We fine-tuned PEGASUS, a pre-trained abstractive summarization model, to specifically summarize figure-referencing paragraphs.
Experiments on large-scale arXiv figures show that our method outperforms prior vision methods in both automatic and human evaluations.
arXiv Detail & Related papers (2023-02-23T20:39:06Z)
- From Show to Tell: A Survey on Image Captioning [48.98681267347662]
Connecting Vision and Language plays an essential role in Generative Intelligence.
Research in image captioning has not yet converged on a conclusive answer.
This work aims at providing a comprehensive overview and categorization of image captioning approaches.
arXiv Detail & Related papers (2021-07-14T18:00:54Z)
- Improving Image Captioning with Better Use of Captions [65.39641077768488]
We present a novel image captioning architecture to better explore semantics available in captions and leverage that to enhance both image representation and caption generation.
Our models first construct caption-guided visual relationship graphs that introduce beneficial inductive bias using weakly supervised multi-instance learning.
During generation, the model further incorporates visual relationships using multi-task learning for jointly predicting word and object/predicate tag sequences.
arXiv Detail & Related papers (2020-06-21T14:10:47Z)
- Egoshots, an ego-vision life-logging dataset and semantic fidelity metric to evaluate diversity in image captioning models [63.11766263832545]
We present a new image captioning dataset, Egoshots, consisting of 978 real life images with no captions.
In order to evaluate the quality of the generated captions, we propose a new image captioning metric, object-based Semantic Fidelity (SF).
arXiv Detail & Related papers (2020-03-26T04:43:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.