Related papers: PEEK: A Large Dataset of Learner Engagement with Educational Videos

PEEK: A Large Dataset of Learner Engagement with Educational Videos

URL: http://arxiv.org/abs/2109.03154v1
Date: Fri, 3 Sep 2021 11:23:02 GMT
Title: PEEK: A Large Dataset of Learner Engagement with Educational Videos
Authors: Sahan Bulathwela, Maria Perez-Ortiz, Erik Novak, Emine Yilmaz, John Shawe-Taylor
Abstract summary: We release a large, novel dataset of learners engaging with educational videos in-the-wild. The dataset, named Personalised Educational Engagement with Knowledge Topics PEEK, is the first publicly available dataset of this nature. We believe that granular learner engagement signals in unison with rich content representations will pave the way to building powerful personalization algorithms.
Score: 20.49299110732228
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Educational recommenders have received much less attention in comparison to e-commerce and entertainment-related recommenders, even though efficient intelligent tutors have great potential to improve learning gains. One of the main challenges in advancing this research direction is the scarcity of large, publicly available datasets. In this work, we release a large, novel dataset of learners engaging with educational videos in-the-wild. The dataset, named Personalised Educational Engagement with Knowledge Topics PEEK, is the first publicly available dataset of this nature. The video lectures have been associated with Wikipedia concepts related to the material of the lecture, thus providing a humanly intuitive taxonomy. We believe that granular learner engagement signals in unison with rich content representations will pave the way to building powerful personalization algorithms that will revolutionise educational and informational recommendation systems. Towards this goal, we 1) construct a novel dataset from a popular video lecture repository, 2) identify a set of benchmark algorithms to model engagement, and 3) run extensive experimentation on the PEEK dataset to demonstrate its value. Our experiments with the dataset show promise in building powerful informational recommender systems. The dataset and the support code is available publicly.

Related papers

VideoWorld: Exploring Knowledge Learning from Unlabeled Videos [119.35107657321902]
This work explores whether a deep generative model can learn complex knowledge solely from visual input. We develop VideoWorld, an auto-regressive video generation model trained on unlabeled video data, and test its knowledge acquisition abilities in video-based Go and robotic control tasks.
arXiv Detail & Related papers (2025-01-16T18:59:10Z)
Language-Model-Assisted Bi-Level Programming for Reward Learning from Internet Videos [48.2044649011213]
We introduce a language-model-assisted bi-level programming framework that enables a reinforcement learning agent to learn its reward from internet videos. The framework includes two levels: an upper level where a vision-language model (VLM) provides feedback by comparing the learner's behavior with expert videos, and a lower level where a large language model (LLM) translates this feedback into reward updates. We validate the method for reward learning from YouTube videos, and the results have shown that the proposed method enables efficient reward design from expert videos of biological agents.
arXiv Detail & Related papers (2024-10-11T22:31:39Z)
A Toolbox for Modelling Engagement with Educational Videos [21.639063299289607]
This work presents the PEEKC dataset and the TrueLearn Python library, which contains a dataset and a series of online learner state models. The dataset contains a large amount of AI-related educational videos, which are of interest for building and validating AI-specific educational recommenders.
arXiv Detail & Related papers (2023-12-30T21:10:55Z)
InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation [90.71796406228265]
InternVid is a large-scale video-centric multimodal dataset that enables learning powerful and transferable video-text representations. The InternVid dataset contains over 7 million videos lasting nearly 760K hours, yielding 234M video clips accompanied by detailed descriptions of total 4.1B words.
arXiv Detail & Related papers (2023-07-13T17:58:32Z)
A Unified Model for Video Understanding and Knowledge Embedding with Heterogeneous Knowledge Graph Dataset [47.805378137676605]
We propose a heterogeneous dataset that contains the multi-modal video entity and fruitful common sense relations. Experiments indicate that combining video understanding embedding with factual knowledge benefits the content-based video retrieval performance. It also helps the model generate better knowledge graph embedding which outperforms traditional KGE-based methods on VRT and VRV tasks.
arXiv Detail & Related papers (2022-11-19T09:00:45Z)
Can Population-based Engagement Improve Personalisation? A Novel Dataset and Experiments [21.12546768556595]
VLE is a novel dataset that consists of content and video based features extracted from publicly available scientific video lectures. Our experimental results indicate that the newly proposed VLE dataset leads to building context-agnostic engagement prediction models. Experiments in combining the built model with a personalising algorithm show promising improvements in addressing the cold-start problem encountered in educational recommenders.
arXiv Detail & Related papers (2022-06-22T15:53:24Z)
Self-Supervised Learning for Videos: A Survey [70.37277191524755]
Self-supervised learning has shown promise in both image and video domains. In this survey, we provide a review of existing approaches on self-supervised learning focusing on the video domain.
arXiv Detail & Related papers (2022-06-18T00:26:52Z)
NoisyActions2M: A Multimedia Dataset for Video Understanding from Noisy Labels [33.659146748289444]
We create a benchmark dataset consisting of around 2 million videos with associated user-generated annotations and other meta information. We show how a network pretrained on the proposed dataset can help against video corruption and label noise in downstream datasets.
arXiv Detail & Related papers (2021-10-13T16:12:18Z)
VLEngagement: A Dataset of Scientific Video Lectures for Evaluating Population-based Engagement [23.078055803229912]
Video lectures have become one of the primary modalities to impart knowledge to masses in the current digital age. There is still an important need for data and research aimed at understanding learner engagement with scientific video lectures. This paper introduces VLEngagement, a novel dataset that consists of content-based and video-specific features extracted from publicly available scientific video lectures.
arXiv Detail & Related papers (2020-11-02T14:20:19Z)
Attentional Graph Convolutional Networks for Knowledge Concept Recommendation in MOOCs in a Heterogeneous View [72.98388321383989]
Massive open online courses ( MOOCs) provide a large-scale and open-access learning opportunity for students to grasp the knowledge. To attract students' interest, the recommendation system is applied by MOOCs providers to recommend courses to students. We propose an end-to-end graph neural network-based approach calledAttentionalHeterogeneous Graph Convolutional Deep Knowledge Recommender(ACKRec) for knowledge concept recommendation in MOOCs.
arXiv Detail & Related papers (2020-06-23T18:28:08Z)
Comprehensive Instructional Video Analysis: The COIN Dataset and Performance Evaluation [100.68317848808327]
We present a large-scale dataset named as "COIN" for COmprehensive INstructional video analysis. COIN dataset contains 11,827 videos of 180 tasks in 12 domains related to our daily life. With a new developed toolbox, all the videos are annotated efficiently with a series of step labels and the corresponding temporal boundaries.
arXiv Detail & Related papers (2020-03-20T16:59:44Z)

This list is automatically generated from the titles and abstracts of the papers in this site.