VLEngagement: A Dataset of Scientific Video Lectures for Evaluating
Population-based Engagement
- URL: http://arxiv.org/abs/2011.02273v1
- Date: Mon, 2 Nov 2020 14:20:19 GMT
- Title: VLEngagement: A Dataset of Scientific Video Lectures for Evaluating
Population-based Engagement
- Authors: Sahan Bulathwela and Maria Perez-Ortiz and Emine Yilmaz and John
Shawe-Taylor
- Abstract summary: Video lectures have become one of the primary modalities to impart knowledge to the masses in the current digital age.
There is still an important need for data and research aimed at understanding learner engagement with scientific video lectures.
This paper introduces VLEngagement, a novel dataset that consists of content-based and video-specific features extracted from publicly available scientific video lectures.
- Score: 23.078055803229912
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the emergence of e-learning and personalised education, the production
and distribution of digital educational resources have boomed. Video lectures
have now become one of the primary modalities to impart knowledge to the masses in
the current digital age. The rapid creation of video lecture content challenges
the currently established human-centred moderation and quality assurance
pipeline, demanding more efficient, scalable and automatic solutions for
managing learning resources. Although a few datasets related to engagement with
educational videos exist, there is still an important need for data and
research aimed at understanding learner engagement with scientific video
lectures. This paper introduces VLEngagement, a novel dataset that consists of
content-based and video-specific features extracted from publicly available
scientific video lectures and several metrics related to user engagement. We
introduce several novel tasks related to predicting and understanding
context-agnostic engagement in video lectures, providing preliminary baselines.
This is the largest and most diverse publicly available dataset to our
knowledge that deals with such tasks. The extraction of Wikipedia topic-based
features also allows associating more sophisticated Wikipedia-based features
with the dataset to improve performance on these tasks. The dataset, helper
tools and example code snippets are available publicly at
https://github.com/sahanbull/context-agnostic-engagement
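As a quick illustration of the context-agnostic engagement prediction task, the Python sketch below loads the released feature file and fits a simple tabular baseline. It is not the authors' reference implementation: the file name "VLEngagement_dataset.csv", the label column "median_engagement", and the random-forest baseline are assumptions for illustration only; consult the helper tools in the repository above for the actual file layout, feature columns, and baseline code.

# Minimal sketch of a context-agnostic engagement baseline on VLEngagement.
# Assumptions (hypothetical; check the repository for the real layout):
#   - the dataset is a CSV with one row per lecture,
#   - numeric content-based and video-specific feature columns,
#   - a "median_engagement" column holding the engagement label.
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

df = pd.read_csv("VLEngagement_dataset.csv")   # hypothetical file name

y = df["median_engagement"]                    # hypothetical label column
X = df.drop(columns=["median_engagement"]).select_dtypes(include="number")

# A random forest is a common tabular baseline; score it with 5-fold
# cross-validated RMSE (lower is better).
model = RandomForestRegressor(n_estimators=200, random_state=0)
rmse = -cross_val_score(model, X, y, cv=5,
                        scoring="neg_root_mean_squared_error").mean()
print(f"Cross-validated RMSE: {rmse:.4f}")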
Related papers
- ViLCo-Bench: VIdeo Language COntinual learning Benchmark [8.660555226687098]
We present ViLCo-Bench, designed to evaluate continual learning models across a range of video-text tasks.
The dataset comprises ten-minute-long videos and corresponding language queries collected from publicly available datasets.
We introduce a novel memory-efficient framework that incorporates self-supervised learning and mimics long-term and short-term memory effects.
arXiv Detail & Related papers (2024-06-19T00:38:19Z)
- CinePile: A Long Video Question Answering Dataset and Benchmark [55.30860239555001]
We present a novel dataset and benchmark, CinePile, specifically designed for authentic long-form video understanding.
Our comprehensive dataset comprises 305,000 multiple-choice questions (MCQs), covering various visual and multimodal aspects.
We fine-tuned open-source Video-LLMs on the training split and evaluated both open-source and proprietary video-centric LLMs on the test split of our dataset.
arXiv Detail & Related papers (2024-05-14T17:59:02Z)
- InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation [90.71796406228265]
InternVid is a large-scale video-centric multimodal dataset that enables learning powerful and transferable video-text representations.
The InternVid dataset contains over 7 million videos lasting nearly 760K hours, yielding 234M video clips accompanied by detailed descriptions totalling 4.1B words.
arXiv Detail & Related papers (2023-07-13T17:58:32Z)
- A Unified Model for Video Understanding and Knowledge Embedding with Heterogeneous Knowledge Graph Dataset [47.805378137676605]
We propose a heterogeneous dataset that contains multi-modal video entities and rich common-sense relations.
Experiments indicate that combining video understanding embeddings with factual knowledge benefits content-based video retrieval performance.
It also helps the model generate better knowledge graph embeddings, which outperform traditional KGE-based methods on VRT and VRV tasks.
arXiv Detail & Related papers (2022-11-19T09:00:45Z)
- PEEK: A Large Dataset of Learner Engagement with Educational Videos [20.49299110732228]
We release a large, novel dataset of learners engaging with educational videos in-the-wild.
The dataset, named Personalised Educational Engagement with Knowledge Topics (PEEK), is the first publicly available dataset of this nature.
We believe that granular learner engagement signals in unison with rich content representations will pave the way to building powerful personalization algorithms.
arXiv Detail & Related papers (2021-09-03T11:23:02Z)
- A Survey on Deep Learning Technique for Video Segmentation [147.0767454918527]
Video segmentation plays a critical role in a broad range of practical applications.
Deep learning-based approaches have been applied to video segmentation and have delivered compelling performance.
arXiv Detail & Related papers (2021-07-02T15:51:07Z)
- VALUE: A Multi-Task Benchmark for Video-and-Language Understanding Evaluation [124.02278735049235]
The VALUE benchmark aims to cover a broad range of video genres, video lengths, data volumes, and task difficulty levels.
We evaluate various baseline methods with and without large-scale VidL pre-training.
The significant gap between our best model and human performance calls for future study of advanced VidL models.
arXiv Detail & Related papers (2021-06-08T18:34:21Z)
- Classification of Important Segments in Educational Videos using Multimodal Features [10.175871202841346]
We propose a multimodal neural architecture that utilizes state-of-the-art audio, visual and textual features.
Our experiments investigate the impact of visual and temporal information, as well as the combination of multimodal features on importance prediction.
arXiv Detail & Related papers (2020-10-26T14:40:23Z)
- Comprehensive Instructional Video Analysis: The COIN Dataset and Performance Evaluation [100.68317848808327]
We present a large-scale dataset named "COIN" for COmprehensive INstructional video analysis.
The COIN dataset contains 11,827 videos of 180 tasks across 12 domains related to daily life.
With a newly developed toolbox, all videos are efficiently annotated with a series of step labels and the corresponding temporal boundaries.
arXiv Detail & Related papers (2020-03-20T16:59:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.