PEEK: A Large Dataset of Learner Engagement with Educational Videos
- URL: http://arxiv.org/abs/2109.03154v1
- Date: Fri, 3 Sep 2021 11:23:02 GMT
- Title: PEEK: A Large Dataset of Learner Engagement with Educational Videos
- Authors: Sahan Bulathwela, Maria Perez-Ortiz, Erik Novak, Emine Yilmaz, John
Shawe-Taylor
- Abstract summary: We release a large, novel dataset of learners engaging with educational videos in-the-wild.
The dataset, named Personalised Educational Engagement with Knowledge Topics (PEEK), is the first publicly available dataset of this nature.
We believe that granular learner engagement signals in unison with rich content representations will pave the way to building powerful personalization algorithms.
- Score: 20.49299110732228
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Educational recommenders have received much less attention in comparison to
e-commerce and entertainment-related recommenders, even though efficient
intelligent tutors have great potential to improve learning gains. One of the
main challenges in advancing this research direction is the scarcity of large,
publicly available datasets. In this work, we release a large, novel dataset of
learners engaging with educational videos in-the-wild. The dataset, named
Personalised Educational Engagement with Knowledge Topics (PEEK), is the first
publicly available dataset of this nature. The video lectures have been
associated with Wikipedia concepts related to the material of the lecture, thus
providing a humanly intuitive taxonomy. We believe that granular learner
engagement signals in unison with rich content representations will pave the
way to building powerful personalization algorithms that will revolutionise
educational and informational recommendation systems. Towards this goal, we 1)
construct a novel dataset from a popular video lecture repository, 2) identify
a set of benchmark algorithms to model engagement, and 3) run extensive
experimentation on the PEEK dataset to demonstrate its value. Our experiments
with the dataset show promise in building powerful informational recommender
systems. The dataset and the support code are publicly available.
Related papers
- Language-Model-Assisted Bi-Level Programming for Reward Learning from Internet Videos [48.2044649011213]
We introduce a language-model-assisted bi-level programming framework that enables a reinforcement learning agent to learn its reward from internet videos.
The framework includes two levels: an upper level where a vision-language model (VLM) provides feedback by comparing the learner's behavior with expert videos, and a lower level where a large language model (LLM) translates this feedback into reward updates.
We validate the method for reward learning from YouTube videos, and the results have shown that the proposed method enables efficient reward design from expert videos of biological agents.
arXiv Detail & Related papers (2024-10-11T22:31:39Z)
- A Toolbox for Modelling Engagement with Educational Videos [21.639063299289607]
This work presents the PEEKC dataset and the TrueLearn Python library, which contains a dataset and a series of online learner state models.
The dataset contains a large amount of AI-related educational videos, which are of interest for building and validating AI-specific educational recommenders.
arXiv Detail & Related papers (2023-12-30T21:10:55Z)
- InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation [90.71796406228265]
InternVid is a large-scale video-centric multimodal dataset that enables learning powerful and transferable video-text representations.
The InternVid dataset contains over 7 million videos lasting nearly 760K hours, yielding 234M video clips accompanied by detailed descriptions totaling 4.1B words.
arXiv Detail & Related papers (2023-07-13T17:58:32Z)
- A Unified Model for Video Understanding and Knowledge Embedding with Heterogeneous Knowledge Graph Dataset [47.805378137676605]
We propose a heterogeneous dataset that contains the multi-modal video entity and fruitful common sense relations.
Experiments indicate that combining video understanding embedding with factual knowledge benefits the content-based video retrieval performance.
It also helps the model generate better knowledge graph embedding which outperforms traditional KGE-based methods on VRT and VRV tasks.
arXiv Detail & Related papers (2022-11-19T09:00:45Z)
- Can Population-based Engagement Improve Personalisation? A Novel Dataset and Experiments [21.12546768556595]
VLE is a novel dataset that consists of content-based and video-based features extracted from publicly available scientific video lectures.
Our experimental results indicate that the newly proposed VLE dataset leads to building context-agnostic engagement prediction models.
Experiments in combining the built model with a personalising algorithm show promising improvements in addressing the cold-start problem encountered in educational recommenders.
arXiv Detail & Related papers (2022-06-22T15:53:24Z)
- Self-Supervised Learning for Videos: A Survey [70.37277191524755]
Self-supervised learning has shown promise in both image and video domains.
In this survey, we provide a review of existing approaches on self-supervised learning focusing on the video domain.
arXiv Detail & Related papers (2022-06-18T00:26:52Z)
- NoisyActions2M: A Multimedia Dataset for Video Understanding from Noisy Labels [33.659146748289444]
We create a benchmark dataset consisting of around 2 million videos with associated user-generated annotations and other meta information.
We show how a network pretrained on the proposed dataset can help against video corruption and label noise in downstream datasets.
arXiv Detail & Related papers (2021-10-13T16:12:18Z)
- VLEngagement: A Dataset of Scientific Video Lectures for Evaluating Population-based Engagement [23.078055803229912]
Video lectures have become one of the primary modalities to impart knowledge to the masses in the current digital age.
There is still an important need for data and research aimed at understanding learner engagement with scientific video lectures.
This paper introduces VLEngagement, a novel dataset that consists of content-based and video-specific features extracted from publicly available scientific video lectures.
arXiv Detail & Related papers (2020-11-02T14:20:19Z)
- Attentional Graph Convolutional Networks for Knowledge Concept Recommendation in MOOCs in a Heterogeneous View [72.98388321383989]
Massive open online courses (MOOCs) provide a large-scale and open-access learning opportunity for students to grasp knowledge.
To attract students' interest, MOOC providers apply recommendation systems to recommend courses to students.
We propose an end-to-end graph neural network-based approach called Attentional Heterogeneous Graph Convolutional Deep Knowledge Recommender (ACKRec) for knowledge concept recommendation in MOOCs.
arXiv Detail & Related papers (2020-06-23T18:28:08Z)
- Comprehensive Instructional Video Analysis: The COIN Dataset and Performance Evaluation [100.68317848808327]
We present a large-scale dataset named "COIN" for COmprehensive INstructional video analysis.
The COIN dataset contains 11,827 videos of 180 tasks in 12 domains related to our daily life.
With a newly developed toolbox, all the videos are annotated efficiently with a series of step labels and the corresponding temporal boundaries.
arXiv Detail & Related papers (2020-03-20T16:59:44Z)