PEEK: A Large Dataset of Learner Engagement with Educational Videos
- URL: http://arxiv.org/abs/2109.03154v1
- Date: Fri, 3 Sep 2021 11:23:02 GMT
- Title: PEEK: A Large Dataset of Learner Engagement with Educational Videos
- Authors: Sahan Bulathwela, Maria Perez-Ortiz, Erik Novak, Emine Yilmaz, John Shawe-Taylor
- Abstract summary: We release a large, novel dataset of learners engaging with educational videos in-the-wild.
The dataset, named Personalised Educational Engagement with Knowledge Topics (PEEK), is the first publicly available dataset of this nature.
We believe that granular learner engagement signals in unison with rich content representations will pave the way to building powerful personalization algorithms.
- Score: 20.49299110732228
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Educational recommenders have received much less attention in comparison to
e-commerce and entertainment-related recommenders, even though efficient
intelligent tutors have great potential to improve learning gains. One of the
main challenges in advancing this research direction is the scarcity of large,
publicly available datasets. In this work, we release a large, novel dataset of
learners engaging with educational videos in-the-wild. The dataset, named
Personalised Educational Engagement with Knowledge Topics (PEEK), is the first
publicly available dataset of this nature. The video lectures have been
associated with Wikipedia concepts related to the material of the lecture, thus
providing a humanly intuitive taxonomy. We believe that granular learner
engagement signals in unison with rich content representations will pave the
way to building powerful personalization algorithms that will revolutionise
educational and informational recommendation systems. Towards this goal, we 1)
construct a novel dataset from a popular video lecture repository, 2) identify
a set of benchmark algorithms to model engagement, and 3) run extensive
experimentation on the PEEK dataset to demonstrate its value. Our experiments
with the dataset show promise in building powerful informational recommender
systems. The dataset and the support code are publicly available.
Related papers
- VideoWorld: Exploring Knowledge Learning from Unlabeled Videos [119.35107657321902]
This work explores whether a deep generative model can learn complex knowledge solely from visual input.
We develop VideoWorld, an auto-regressive video generation model trained on unlabeled video data, and test its knowledge acquisition abilities in video-based Go and robotic control tasks.
arXiv Detail & Related papers (2025-01-16T18:59:10Z)
- T2Vid: Translating Long Text into Multi-Image is the Catalyst for Video-LLMs [102.66246727371583]
We develop a method called T2Vid to synthesize video-like samples to enrich the instruction diversity in the training corpus.
We find that the proposed scheme can boost the performance of long video understanding without training with long video samples.
arXiv Detail & Related papers (2024-11-29T18:59:54Z)
- A Toolbox for Modelling Engagement with Educational Videos [21.639063299289607]
This work presents the PEEKC dataset and the TrueLearn Python library, which contains a dataset and a series of online learner state models.
The dataset contains a large amount of AI-related educational videos, which are of interest for building and validating AI-specific educational recommenders.
arXiv Detail & Related papers (2023-12-30T21:10:55Z)
- InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation [90.71796406228265]
InternVid is a large-scale video-centric multimodal dataset that enables learning powerful and transferable video-text representations.
The InternVid dataset contains over 7 million videos lasting nearly 760K hours, yielding 234M video clips accompanied by detailed descriptions totalling 4.1B words.
arXiv Detail & Related papers (2023-07-13T17:58:32Z)
- Can Population-based Engagement Improve Personalisation? A Novel Dataset and Experiments [21.12546768556595]
VLE is a novel dataset that consists of content and video based features extracted from publicly available scientific video lectures.
Our experimental results indicate that the newly proposed VLE dataset leads to building context-agnostic engagement prediction models.
Experiments in combining the built model with a personalising algorithm show promising improvements in addressing the cold-start problem encountered in educational recommenders.
arXiv Detail & Related papers (2022-06-22T15:53:24Z)
- Self-Supervised Learning for Videos: A Survey [70.37277191524755]
Self-supervised learning has shown promise in both image and video domains.
In this survey, we provide a review of existing approaches on self-supervised learning focusing on the video domain.
arXiv Detail & Related papers (2022-06-18T00:26:52Z)
- NoisyActions2M: A Multimedia Dataset for Video Understanding from Noisy Labels [33.659146748289444]
We create a benchmark dataset consisting of around 2 million videos with associated user-generated annotations and other meta information.
We show how a network pretrained on the proposed dataset can help against video corruption and label noise in downstream datasets.
arXiv Detail & Related papers (2021-10-13T16:12:18Z)
- VLEngagement: A Dataset of Scientific Video Lectures for Evaluating Population-based Engagement [23.078055803229912]
Video lectures have become one of the primary modalities to impart knowledge to masses in the current digital age.
There is still an important need for data and research aimed at understanding learner engagement with scientific video lectures.
This paper introduces VLEngagement, a novel dataset that consists of content-based and video-specific features extracted from publicly available scientific video lectures.
arXiv Detail & Related papers (2020-11-02T14:20:19Z)
- Attentional Graph Convolutional Networks for Knowledge Concept Recommendation in MOOCs in a Heterogeneous View [72.98388321383989]
Massive open online courses (MOOCs) provide a large-scale and open-access learning opportunity for students to grasp the knowledge.
To attract students' interest, the recommendation system is applied by MOOCs providers to recommend courses to students.
We propose an end-to-end graph neural network-based approach called Attentional Heterogeneous Graph Convolutional Deep Knowledge Recommender (ACKRec) for knowledge concept recommendation in MOOCs.
arXiv Detail & Related papers (2020-06-23T18:28:08Z)
- Comprehensive Instructional Video Analysis: The COIN Dataset and Performance Evaluation [100.68317848808327]
We present a large-scale dataset named "COIN" for COmprehensive INstructional video analysis.
The COIN dataset contains 11,827 videos of 180 tasks in 12 domains related to our daily life.
With a newly developed toolbox, all the videos are annotated efficiently with a series of step labels and the corresponding temporal boundaries.
arXiv Detail & Related papers (2020-03-20T16:59:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.