Video Summarization Overview
- URL: http://arxiv.org/abs/2210.11707v1
- Date: Fri, 21 Oct 2022 03:29:31 GMT
- Title: Video Summarization Overview
- Authors: Mayu Otani and Yale Song and Yang Wang
- Abstract summary: Video summarization facilitates quickly grasping video content by creating a compact summary of videos.
This survey covers early studies as well as recent approaches which take advantage of deep learning techniques.
- Score: 25.465707307283434
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the broad growth of video capturing devices and applications on the web, there is a growing demand to provide desired video content to users efficiently.
Video summarization facilitates quickly grasping video content by creating a
compact summary of videos. Much effort has been devoted to automatic video
summarization, and various problem settings and approaches have been proposed.
Our goal is to provide an overview of this field. This survey covers early
studies as well as recent approaches which take advantage of deep learning
techniques. We describe video summarization approaches and their underlying
concepts. We also discuss benchmarks and evaluations, review how prior work
has addressed evaluation, and detail the pros and cons of the evaluation
protocols. Last but not least, we discuss open challenges in this field.
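To make the underlying concepts concrete, the following is a minimal sketch (in Python, with a placeholder scoring function and hypothetical data) of the formulation shared by many of the surveyed approaches: assign an importance score to each frame, select a small subset under a length budget, and compare the selection against a human reference with a keyframe-overlap F1 score of the kind used by several benchmarks. It illustrates the problem setting only, not the method of any specific paper.

# Minimal illustration of "score frames, then select a compact subset".
# The scoring function is a stand-in (frame-to-frame feature difference),
# not a method from the survey.
import numpy as np

def importance_scores(frames: np.ndarray) -> np.ndarray:
    """Assign an importance score to each frame.

    `frames` has shape (T, D): T frames, each a D-dimensional feature vector.
    As a placeholder, a frame is scored by how much it differs from its
    predecessor; real systems learn this scoring with a neural network.
    """
    diffs = np.linalg.norm(np.diff(frames, axis=0), axis=1)
    return np.concatenate([[0.0], diffs])

def select_keyframes(scores: np.ndarray, budget: int) -> list[int]:
    """Pick the `budget` highest-scoring frames, returned in time order."""
    top = np.argsort(scores)[-budget:]
    return sorted(top.tolist())

def f1_against_reference(pred: set[int], ref: set[int]) -> float:
    """Keyframe-overlap F1, a common (if debated) evaluation protocol."""
    if not pred or not ref:
        return 0.0
    overlap = len(pred & ref)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    video = rng.normal(size=(120, 64))      # 120 frames of 64-d features (synthetic)
    summary = select_keyframes(importance_scores(video), budget=12)
    reference = set(range(0, 120, 10))      # hypothetical human-annotated keyframes
    print(summary, f1_against_reference(set(summary), reference))

Learned methods replace the hand-crafted difference score with a neural scoring network, and many replace the simple top-k selection with a knapsack over shots so that the summary respects a length budget.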
Related papers
- Conditional Modeling Based Automatic Video Summarization [70.96973928590958]
The aim of video summarization is to shorten videos automatically while retaining the key information necessary to convey the overall story.
Video summarization methods rely on visual factors, such as visual consecutiveness and diversity, which may not be sufficient to fully understand the content of the video.
A new approach to video summarization is proposed based on insights gained from how humans create ground truth video summaries.
arXiv Detail & Related papers (2023-11-20T20:24:45Z)
- Causal Video Summarizer for Video Exploration [74.27487067877047]
Causal Video Summarizer (CVS) is proposed to capture the interactive information between the video and query.
In an evaluation on an existing multi-modal video summarization dataset, experimental results show that the proposed approach is effective.
arXiv Detail & Related papers (2023-07-04T22:52:16Z)
- Learning to Summarize Videos by Contrasting Clips [1.3999481573773074]
Video summarization aims at choosing parts of a video that narrate a story as close as possible to the original one.
Most existing video summarization approaches rely on hand-crafted labels.
We propose contrastive learning to address these challenges.
arXiv Detail & Related papers (2023-01-12T18:55:30Z)
- TL;DW? Summarizing Instructional Videos with Task Relevance & Cross-Modal Saliency [133.75876535332003]
We focus on summarizing instructional videos, an under-explored area of video summarization.
Existing video summarization datasets rely on manual frame-level annotations.
We propose an instructional video summarization network that combines a context-aware temporal video encoder and a segment scoring transformer.
arXiv Detail & Related papers (2022-08-14T04:07:40Z)
- A Survey on Deep Learning Technique for Video Segmentation [147.0767454918527]
Video segmentation plays a critical role in a broad range of practical applications.
Deep learning based approaches have been devoted to video segmentation and have delivered compelling performance.
arXiv Detail & Related papers (2021-07-02T15:51:07Z)
- Video Summarization Using Deep Neural Networks: A Survey [72.98424352264904]
Video summarization technologies aim to create a concise and complete synopsis by selecting the most informative parts of the video content.
This work focuses on the recent advances in the area and provides a comprehensive survey of the existing deep-learning-based methods for generic video summarization.
arXiv Detail & Related papers (2021-01-15T11:41:29Z)
- A Comprehensive Review on Recent Methods and Challenges of Video Description [11.69687792533269]
Video description involves the generation of the natural language description of actions, events, and objects in the video.
Video description has various applications, such as bridging the gap between language and vision for visually impaired people.
In the past decade, several works have been done in this field on approaches/methods for video description, evaluation metrics, and datasets.
arXiv Detail & Related papers (2020-11-30T13:08:45Z)
- Query-controllable Video Summarization [16.54586273670312]
We introduce a method which takes a text-based query as input and generates a video summary corresponding to it.
Our proposed method consists of a video summary controller, video summary generator, and video summary output module.
arXiv Detail & Related papers (2020-04-07T19:35:04Z)
- Convolutional Hierarchical Attention Network for Query-Focused Video Summarization [74.48782934264094]
This paper addresses the task of query-focused video summarization, which takes a user's query and a long video as inputs.
We propose a method, named Convolutional Hierarchical Attention Network (CHAN), which consists of two parts: feature encoding network and query-relevance computing module.
In the encoding network, we employ a convolutional network with a local self-attention mechanism and a query-aware global attention mechanism to learn the visual information of each shot; the sketch after this list illustrates the basic query-to-shot relevance idea.
arXiv Detail & Related papers (2020-01-31T04:30:14Z)
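The query-controllable and query-focused entries above describe architectures that condition the summary on a text query. The sketch below is a simplified assumption rather than any of the published models: embed the query, score each shot by its similarity to that embedding, and keep the most relevant shots.

# Minimal sketch of query-focused shot selection. The random embeddings and
# cosine scoring are illustrative stand-ins, not the published architectures.
import numpy as np

def cosine(query: np.ndarray, shots: np.ndarray) -> np.ndarray:
    """Cosine similarity between one query vector and each row of `shots`."""
    q = query / (np.linalg.norm(query) + 1e-8)
    s = shots / (np.linalg.norm(shots, axis=1, keepdims=True) + 1e-8)
    return s @ q

def query_focused_summary(query_emb: np.ndarray,
                          shot_feats: np.ndarray,
                          budget: int) -> list[int]:
    """Return indices of the `budget` shots most relevant to the query, in time order."""
    relevance = cosine(query_emb, shot_feats)
    return sorted(np.argsort(relevance)[-budget:].tolist())

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    shots = rng.normal(size=(40, 128))   # 40 shots with 128-d visual features (synthetic)
    query = rng.normal(size=128)         # stand-in for a text-query embedding
    print(query_focused_summary(query, shots, budget=5))

Real systems replace the random stand-in vectors with learned visual and textual encoders and learn the query-to-shot relevance function end to end.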