Beyond the Lens: Quantifying the Impact of Scientific Documentaries through Amazon Reviews
- URL: http://arxiv.org/abs/2502.08705v1
- Date: Wed, 12 Feb 2025 19:00:01 GMT
- Title: Beyond the Lens: Quantifying the Impact of Scientific Documentaries through Amazon Reviews
- Authors: Jill Naiman, Aria Pessianzadeh, Hanyu Zhao, AJ Christensen, Alistair Nunn, Shriya Srikanth, Anushka Gami, Emma Maxwell, Louisa Zhang, Sri Nithya Yeragorla, Rezvaneh Rezapour
- Abstract summary: Engaging the public with science is critical for a well-informed population.
Once released, it can be difficult to assess the impact of such documentaries on a large scale, due to the overhead required for in-depth audience feedback studies.
In what follows, we overview our complementary approach to qualitative studies through quantitative impact and sentiment analysis of Amazon reviews for several scientific documentaries.
- Score: 1.809089144827797
- License:
- Abstract: Engaging the public with science is critical for a well-informed population. A popular method of scientific communication is documentaries. Once released, it can be difficult to assess the impact of such works on a large scale, due to the overhead required for in-depth audience feedback studies. In what follows, we overview our complementary approach to qualitative studies through quantitative impact and sentiment analysis of Amazon reviews for several scientific documentaries. In addition to developing a novel impact category taxonomy for this analysis, we release a dataset containing 1296 human-annotated sentences from 1043 Amazon reviews for six movies created in whole or part by a team of visualization designers who focus on cinematic presentations of scientific data. Using this data, we train and evaluate several machine learning and large language models, discussing their effectiveness and possible generalizability for documentaries beyond those focused on for this work. Themes are also extracted from our annotated dataset which, along with our large language model analysis, demonstrate a measure of the ability of scientific documentaries to engage with the public.
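The abstract describes training and evaluating classifiers on human-annotated review sentences labeled with impact categories. As a rough illustration only (a minimal sketch, not the authors' pipeline; the file name, column names, label set, and choice of model are assumptions), a baseline sentence classifier on such a dataset might look like this:

```python
# Minimal sketch: train a baseline impact-category classifier on annotated
# review sentences. The CSV name and column names below are hypothetical,
# not the released dataset's actual schema.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Hypothetical layout: one annotated sentence per row with an impact label.
df = pd.read_csv("annotated_review_sentences.csv")  # columns: sentence, label

X_train, X_test, y_train, y_test = train_test_split(
    df["sentence"], df["label"],
    test_size=0.2, random_state=0, stratify=df["label"],
)

# TF-IDF features plus logistic regression as a simple, strong baseline
# against which LLM-based classifiers could be compared.
clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=2),
    LogisticRegression(max_iter=1000),
)
clf.fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))
```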
Related papers
- What Is That Talk About? A Video-to-Text Summarization Dataset for Scientific Presentations [47.79536652721794]
This paper introduces VISTA, a dataset specifically designed for video-to-text summarization in scientific domains.
We benchmark the performance of state-of-the-art large models and apply a plan-based framework to better capture the structured nature of abstracts.
arXiv Detail & Related papers (2025-02-12T10:36:55Z)
- SciRIFF: A Resource to Enhance Language Model Instruction-Following over Scientific Literature [80.49349719239584]
We present SciRIFF (Scientific Resource for Instruction-Following and Finetuning), a dataset of 137K instruction-following demonstrations for 54 tasks.
SciRIFF is the first dataset focused on extracting and synthesizing information from research literature across a wide range of scientific fields.
arXiv Detail & Related papers (2024-06-10T21:22:08Z)
- MASSW: A New Dataset and Benchmark Tasks for AI-Assisted Scientific Workflows [58.56005277371235]
We introduce MASSW, a comprehensive text dataset on Multi-Aspect Summarization of Scientific Workflows.
MASSW includes more than 152,000 peer-reviewed publications from 17 leading computer science conferences spanning the past 50 years.
We demonstrate the utility of MASSW through multiple novel machine-learning tasks that can be benchmarked using this new dataset.
arXiv Detail & Related papers (2024-06-10T15:19:09Z)
- Automated Focused Feedback Generation for Scientific Writing Assistance [6.559560602099439]
We present SWIF$^2$T: a Scientific WrIting Focused Feedback Tool.
It is designed to generate specific, actionable and coherent comments, which identify weaknesses in a scientific paper and/or propose revisions to it.
We compile a dataset of 300 peer reviews citing weaknesses in scientific papers and conduct human evaluation.
The results demonstrate the superiority of SWIF$^2$T's feedback over other approaches in specificity, reading comprehension, and overall helpfulness.
arXiv Detail & Related papers (2024-05-30T20:56:41Z)
- LFED: A Literary Fiction Evaluation Dataset for Large Language Models [58.85989777743013]
We collect 95 literary fictions that are either originally written in Chinese or translated into Chinese, covering a wide range of topics across several centuries.
We define a question taxonomy with 8 question categories to guide the creation of 1,304 questions.
We conduct an in-depth analysis to ascertain how specific attributes of literary fictions (e.g., novel types, character numbers, the year of publication) impact LLM performance in evaluations.
arXiv Detail & Related papers (2024-05-16T15:02:24Z)
- Forecasting high-impact research topics via machine learning on evolving knowledge graphs [0.6906005491572401]
We show how to predict the impact of onsets of ideas that have never been published by researchers.
We developed a large evolving knowledge graph built from more than 21 million scientific papers.
Using machine learning, we can predict the dynamics of the evolving network into the future with high accuracy.
arXiv Detail & Related papers (2024-02-13T18:09:38Z)
- Models See Hallucinations: Evaluating the Factuality in Video Captioning [57.85548187177109]
We conduct a human evaluation of the factuality in video captioning and collect two annotated factuality datasets.
We find that 57.0% of model-generated sentences contain factual errors, indicating that factuality is a severe problem in this field.
We propose a weakly-supervised, model-based factuality metric FactVC, which outperforms previous metrics on factuality evaluation of video captioning.
arXiv Detail & Related papers (2023-03-06T08:32:50Z)
- Can We Automate Scientific Reviewing? [89.50052670307434]
We discuss the possibility of using state-of-the-art natural language processing (NLP) models to generate first-pass peer reviews for scientific papers.
We collect a dataset of papers in the machine learning domain, annotate them with different aspects of content covered in each review, and train targeted summarization models that take in papers to generate reviews.
Comprehensive experimental results show that system-generated reviews tend to touch upon more aspects of the paper than human-written reviews.
arXiv Detail & Related papers (2021-01-30T07:16:53Z)
- VLEngagement: A Dataset of Scientific Video Lectures for Evaluating Population-based Engagement [23.078055803229912]
Video lectures have become one of the primary modalities for imparting knowledge to the masses in the current digital age.
There is still an important need for data and research aimed at understanding learner engagement with scientific video lectures.
This paper introduces VLEngagement, a novel dataset that consists of content-based and video-specific features extracted from publicly available scientific video lectures.
arXiv Detail & Related papers (2020-11-02T14:20:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences arising from its use.