Tutorial Recommendation for Livestream Videos using Discourse-Level
Consistency and Ontology-Based Filtering
- URL: http://arxiv.org/abs/2209.04953v1
- Date: Sun, 11 Sep 2022 22:45:57 GMT
- Title: Tutorial Recommendation for Livestream Videos using Discourse-Level
Consistency and Ontology-Based Filtering
- Authors: Amir Pouran Ben Veyseh, Franck Dernoncourt, Thien Huu Nguyen
- Abstract summary: We present a novel dataset and model for the task of tutorial recommendation for live-streamed videos.
A system can analyze the content of the live streaming video and recommend the most relevant tutorials.
- Score: 75.78484403289228
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Streaming videos is one way for creators to share their creative
work with their audience. In these videos, the streamer shares how they
achieve their final objective using various tools in one or several programs
for creative projects, and the steps required to reach that goal can be
discussed along the way. As such, these videos can provide substantial
educational content for learning how to employ the tools the streamer uses.
However, one drawback is that the streamer might not provide enough detail
for every step. Therefore, it can be difficult for learners to follow all the
steps. To alleviate this issue, one solution is to link the streaming video
to the relevant tutorials available for the tools used in it. More
specifically, a system can analyze the content of the live streaming video
and recommend the most relevant tutorials. Since existing document
recommendation models cannot handle this setting, in this work we present a
novel dataset and model for the task of tutorial recommendation for
live-streamed videos. We conduct extensive analyses on the proposed dataset
and models, revealing the challenging nature of this task.
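To make the proposed setting concrete, the following is a minimal sketch of a content-based recommender in this spirit: candidate tutorials are first filtered by the tools detected in the stream (a simple stand-in for the ontology-based filtering named in the title), then ranked by transcript similarity. All function names and the toy data are hypothetical; this is not the authors' model, which also exploits discourse-level consistency.

    # Hypothetical sketch of a tutorial recommender for livestream videos:
    # filter candidates by the tools used in the stream, then rank the
    # remainder by lexical similarity to the stream transcript.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    def recommend_tutorials(stream_transcript, stream_tools, tutorials, top_k=3):
        """Return the top_k tutorials most relevant to the livestream.

        tutorials: list of dicts with "title", "text", and "tools" keys.
        """
        # Ontology-style filtering: keep only tutorials covering at least
        # one tool actually used in the stream.
        candidates = [t for t in tutorials if stream_tools & set(t["tools"])]
        if not candidates:
            return []

        # Rank candidates by TF-IDF cosine similarity between the stream
        # transcript and each tutorial's text.
        vectorizer = TfidfVectorizer(stop_words="english")
        matrix = vectorizer.fit_transform(
            [stream_transcript] + [t["text"] for t in candidates])
        scores = cosine_similarity(matrix[0:1], matrix[1:]).ravel()

        ranked = sorted(zip(scores, candidates),
                        key=lambda p: p[0], reverse=True)
        return [(t["title"], round(float(s), 3)) for s, t in ranked[:top_k]]

    # Toy usage: one livestream segment and three candidate tutorials.
    tutorials = [
        {"title": "Masking basics in Photoshop", "tools": ["photoshop"],
         "text": "How to create and refine layer masks in Photoshop."},
        {"title": "Color grading in Premiere", "tools": ["premiere"],
         "text": "Applying Lumetri color grading to video clips."},
        {"title": "Photoshop brush settings", "tools": ["photoshop"],
         "text": "Customizing brush opacity, flow, and hardness."},
    ]
    print(recommend_tutorials(
        "Now I add a layer mask and paint with a soft brush to blend the edges.",
        {"photoshop"}, tutorials))

A production system would presumably replace the TF-IDF ranker with learned representations of both the stream transcript and the tutorial text; the filter-then-rank structure is the part this sketch aims to illustrate.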
Related papers
- Detours for Navigating Instructional Videos [58.1645668396789]
We propose VidDetours, a video-language approach that learns to retrieve the targeted temporal segments from a large repository of how-to's.
We show our model's significant improvements over the best available methods for video retrieval and question answering, with recall rates exceeding the state of the art by 35% (see the recall@k sketch after this list).
arXiv Detail & Related papers (2024-01-03T16:38:56Z)
- VIDiff: Translating Videos via Multi-Modal Instructions with Diffusion Models [96.55004961251889]
Video Instruction Diffusion (VIDiff) is a unified foundation model designed for a wide range of video tasks.
Our model can edit and translate videos within seconds, following user instructions.
We provide convincing generative results for diverse input videos and written instructions, both qualitatively and quantitatively.
arXiv Detail & Related papers (2023-11-30T18:59:52Z) - TL;DW? Summarizing Instructional Videos with Task Relevance &
Cross-Modal Saliency [133.75876535332003]
We focus on summarizing instructional videos, an under-explored area of video summarization.
Existing video summarization datasets rely on manual frame-level annotations.
We propose an instructional video summarization network that combines a context-aware temporal video encoder and a segment scoring transformer.
arXiv Detail & Related papers (2022-08-14T04:07:40Z) - Self-Supervised Learning for Videos: A Survey [70.37277191524755]
Self-supervised learning has shown promise in both image and video domains.
In this survey, we review existing approaches to self-supervised learning with a focus on the video domain.
arXiv Detail & Related papers (2022-06-18T00:26:52Z) - Highlight Timestamp Detection Model for Comedy Videos via Multimodal
Sentiment Analysis [1.6181085766811525]
We propose a multimodal architecture that achieves state-of-the-art performance on this task.
We evaluate on several multimodal video understanding benchmarks to identify the best-performing model.
arXiv Detail & Related papers (2021-05-28T08:39:19Z) - The complementarity of a diverse range of deep learning features
extracted from video content for video recommendation [2.092922495279074]
We explore the potential of various deep learning features to provide video recommendations.
Experiments on a real-world video dataset for movie recommendations show that deep learning features outperform hand-crafted features.
In particular, recommendations generated with deep learning audio features and action-centric deep learning features are superior to MFCC and state-of-the-art iDT features.
arXiv Detail & Related papers (2020-11-21T18:00:28Z)
- Non-Adversarial Video Synthesis with Learned Priors [53.26777815740381]
We focus on the problem of generating videos from latent noise vectors, without any reference input frames.
We develop a novel approach that jointly optimizes the input latent space, the weights of a recurrent neural network, and a generator through non-adversarial learning.
Our approach generates superior quality videos compared to the existing state-of-the-art methods.
arXiv Detail & Related papers (2020-03-21T02:57:33Z)
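Several of the retrieval papers above report recall@k (for instance, VidDetours' recall improvements). For reference, here is a minimal sketch of that metric; the data layout is hypothetical, not taken from any of the papers.

    # Minimal sketch of recall@k, the retrieval metric cited above.
    def recall_at_k(ranked_results, relevant, k):
        """Fraction of queries whose ground-truth item is in the top k.

        ranked_results: {query_id: [item_id, ...]} ranked best-first.
        relevant: {query_id: item_id} ground truth per query.
        """
        hits = sum(
            1 for q, items in ranked_results.items()
            if relevant[q] in items[:k]
        )
        return hits / len(ranked_results)

    # Two queries: the first hits at rank 1, the second misses the top 2.
    ranked = {"q1": ["v3", "v1", "v7"], "q2": ["v5", "v2", "v9"]}
    truth = {"q1": "v3", "q2": "v9"}
    print(recall_at_k(ranked, truth, k=2))  # 0.5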