Tutorial Recommendation for Livestream Videos using Discourse-Level
Consistency and Ontology-Based Filtering
- URL: http://arxiv.org/abs/2209.04953v1
- Date: Sun, 11 Sep 2022 22:45:57 GMT
- Title: Tutorial Recommendation for Livestream Videos using Discourse-Level
Consistency and Ontology-Based Filtering
- Authors: Amir Pouran Ben Veyseh, Franck Dernoncourt, Thien Huu Nguyen
- Abstract summary: We present a novel dataset and model for the task of tutorial recommendation for live-streamed videos.
A system can analyze the content of the live streaming video and recommend the most relevant tutorials.
- Score: 75.78484403289228
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Streaming videos is one way for creators to share their creative
work with their audience. In these videos, the streamer shares how they
achieve their final objective using various tools in one or several programs
for creative projects, and the steps required to reach that goal can be
discussed along the way. As such, these videos can provide substantial
educational content for learning how to employ the tools the streamer uses.
However, one drawback is that the streamer might not provide enough detail
for every step. Therefore, it can be difficult for learners to follow all the
steps. To alleviate this issue, one solution is to link the streaming video
to the relevant tutorials available for the tools used in it. More
specifically, a system can analyze the content of the live streaming video
and recommend the most relevant tutorials. Since existing document
recommendation models cannot handle this setting, in this work we present a
novel dataset and model for the task of tutorial recommendation for
live-streamed videos. We conduct extensive analyses on the proposed dataset
and models, revealing the challenging nature of this task.
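To make the proposed setting concrete, the following is a minimal sketch of a content-based recommender in this spirit: candidate tutorials are first filtered by the tools detected in the stream (a simple stand-in for the ontology-based filtering named in the title), then ranked by transcript similarity. All function names and the toy data are hypothetical; this is not the authors' model, which also exploits discourse-level consistency.

    # Hypothetical sketch of a tutorial recommender for livestream videos:
    # filter candidates by the tools used in the stream, then rank the
    # remainder by lexical similarity to the stream transcript.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    def recommend_tutorials(stream_transcript, stream_tools, tutorials, top_k=3):
        """Return the top_k tutorials most relevant to the livestream.

        tutorials: list of dicts with "title", "text", and "tools" keys.
        """
        # Ontology-style filtering: keep only tutorials covering at least
        # one tool actually used in the stream.
        candidates = [t for t in tutorials if stream_tools & set(t["tools"])]
        if not candidates:
            return []

        # Rank candidates by TF-IDF cosine similarity between the stream
        # transcript and each tutorial's text.
        vectorizer = TfidfVectorizer(stop_words="english")
        matrix = vectorizer.fit_transform(
            [stream_transcript] + [t["text"] for t in candidates])
        scores = cosine_similarity(matrix[0:1], matrix[1:]).ravel()

        ranked = sorted(zip(scores, candidates),
                        key=lambda p: p[0], reverse=True)
        return [(t["title"], round(float(s), 3)) for s, t in ranked[:top_k]]

    # Toy usage: one livestream segment and three candidate tutorials.
    tutorials = [
        {"title": "Masking basics in Photoshop", "tools": ["photoshop"],
         "text": "How to create and refine layer masks in Photoshop."},
        {"title": "Color grading in Premiere", "tools": ["premiere"],
         "text": "Applying Lumetri color grading to video clips."},
        {"title": "Photoshop brush settings", "tools": ["photoshop"],
         "text": "Customizing brush opacity, flow, and hardness."},
    ]
    print(recommend_tutorials(
        "Now I add a layer mask and paint with a soft brush to blend the edges.",
        {"photoshop"}, tutorials))

A production system would presumably replace the TF-IDF ranker with learned representations of both the stream transcript and the tutorial text; the filter-then-rank structure is the part this sketch aims to illustrate.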
Related papers
- Detours for Navigating Instructional Videos [58.1645668396789]
We propose VidDetours, a video-language approach that learns to retrieve the targeted temporal segments from a large repository of how-to's.
We show our model's significant improvements over the best available methods for video retrieval and question answering, with recall rates exceeding the state of the art by 35% (see the recall@k sketch after this list).
arXiv Detail & Related papers (2024-01-03T16:38:56Z)
- VIDiff: Translating Videos via Multi-Modal Instructions with Diffusion Models [96.55004961251889]
Video Instruction Diffusion (VIDiff) is a unified foundation model designed for a wide range of video tasks.
Our model can edit and translate videos within seconds, following user instructions.
We provide convincing generative results for diverse input videos and written instructions, both qualitatively and quantitatively.
arXiv Detail & Related papers (2023-11-30T18:59:52Z) - TL;DW? Summarizing Instructional Videos with Task Relevance &
Cross-Modal Saliency [133.75876535332003]
We focus on summarizing instructional videos, an under-explored area of video summarization.
Existing video summarization datasets rely on manual frame-level annotations.
We propose an instructional video summarization network that combines a context-aware temporal video encoder and a segment scoring transformer.
arXiv Detail & Related papers (2022-08-14T04:07:40Z) - Self-Supervised Learning for Videos: A Survey [70.37277191524755]
Self-supervised learning has shown promise in both image and video domains.
In this survey, we review existing approaches to self-supervised learning with a focus on the video domain.
arXiv Detail & Related papers (2022-06-18T00:26:52Z) - Highlight Timestamp Detection Model for Comedy Videos via Multimodal
Sentiment Analysis [1.6181085766811525]
We propose a multimodal architecture that achieves state-of-the-art performance on this task.
We evaluate on several multimodal video understanding benchmarks to identify the best-performing model.
arXiv Detail & Related papers (2021-05-28T08:39:19Z) - The complementarity of a diverse range of deep learning features
extracted from video content for video recommendation [2.092922495279074]
We explore the potential of various deep learning features to provide video recommendations.
Experiments on a real-world video dataset for movie recommendations show that deep learning features outperform hand-crafted features.
In particular, recommendations generated with deep learning audio features and action-centric deep learning features are superior to MFCC and state-of-the-art iDT features.
arXiv Detail & Related papers (2020-11-21T18:00:28Z)
- Non-Adversarial Video Synthesis with Learned Priors [53.26777815740381]
We focus on the problem of generating videos from latent noise vectors, without any reference input frames.
We develop a novel approach that jointly optimizes the input latent space, the weights of a recurrent neural network, and a generator through non-adversarial learning.
Our approach generates superior quality videos compared to the existing state-of-the-art methods.
arXiv Detail & Related papers (2020-03-21T02:57:33Z)
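Several of the retrieval papers above report recall@k (for instance, VidDetours' recall improvements). For reference, here is a minimal sketch of that metric; the data layout is hypothetical, not taken from any of the papers.

    # Minimal sketch of recall@k, the retrieval metric cited above.
    def recall_at_k(ranked_results, relevant, k):
        """Fraction of queries whose ground-truth item is in the top k.

        ranked_results: {query_id: [item_id, ...]} ranked best-first.
        relevant: {query_id: item_id} ground truth per query.
        """
        hits = sum(
            1 for q, items in ranked_results.items()
            if relevant[q] in items[:k]
        )
        return hits / len(ranked_results)

    # Two queries: the first hits at rank 1, the second misses the top 2.
    ranked = {"q1": ["v3", "v1", "v7"], "q2": ["v5", "v2", "v9"]}
    truth = {"q1": "v3", "q2": "v9"}
    print(recall_at_k(ranked, truth, k=2))  # 0.5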