A Survey on Deep Learning Technique for Video Segmentation
- URL: http://arxiv.org/abs/2107.01153v1
- Date: Fri, 2 Jul 2021 15:51:07 GMT
- Title: A Survey on Deep Learning Technique for Video Segmentation
- Authors: Wenguan Wang, Tianfei Zhou, Fatih Porikli, David Crandall, Luc Van Gool
- Abstract summary: Video segmentation plays a critical role in a broad range of practical applications.
Deep learning based approaches have been dedicated to video segmentation and delivered compelling performance.
- Score: 147.0767454918527
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Video segmentation, i.e., partitioning video frames into multiple segments or
objects, plays a critical role in a broad range of practical applications,
e.g., visual effect assistance in movies, scene understanding in autonomous
driving, and virtual background creation in video conferencing, to name a few.
Recently, due to the renaissance of connectionism in computer vision, there has
been an influx of numerous deep learning based approaches that have been
dedicated to video segmentation and delivered compelling performance. In this
survey, we comprehensively review two basic lines of research in this area,
i.e., generic object segmentation (of unknown categories) in videos and video
semantic segmentation, by introducing their respective task settings,
background concepts, perceived need, development history, and main challenges.
We also provide a detailed overview of representative literature on both
methods and datasets. Additionally, we present quantitative performance
comparisons of the reviewed methods on benchmark datasets. Finally, we point
out a set of unsolved open issues in this field, and suggest possible
opportunities for further research.
Related papers
- Video Summarization Techniques: A Comprehensive Review [1.6381055567716192]
The paper explores the various approaches and methods developed for video summarization, emphasizing both abstractive and extractive strategies.
Extractive summarization identifies key frames or segments in the source video, using methods such as shot boundary recognition and clustering.
Abstractive summarization, on the other hand, creates new content by extracting the essential content of the video, using machine learning techniques such as deep neural networks, natural language processing, reinforcement learning, attention mechanisms, generative adversarial networks, and multi-modal learning.
arXiv Detail & Related papers (2024-10-06T11:17:54Z)
- VISA: Reasoning Video Object Segmentation via Large Language Models [64.33167989521357]
We introduce a new task, Reasoning Video Object Segmentation (ReasonVOS).
This task aims to generate a sequence of segmentation masks in response to implicit text queries that require complex reasoning abilities.
We introduce VISA (Video-based large language Instructed Assistant) to tackle ReasonVOS.
arXiv Detail & Related papers (2024-07-16T02:29:29Z)
- Deep Learning Techniques for Video Instance Segmentation: A Survey [19.32547752428875]
Video instance segmentation is an emerging computer vision research area introduced in 2019.
Deep-learning techniques take a dominant role in various computer vision areas.
This survey offers a multifaceted view of deep-learning schemes for video instance segmentation.
arXiv Detail & Related papers (2023-10-19T00:27:30Z)
- Learning Visual Affordance Grounding from Demonstration Videos [76.46484684007706]
Affordance grounding aims to segment all possible interaction regions between people and objects from an image/video.
We propose a Hand-aided Affordance Grounding Network (HAGNet) that leverages the aided clues provided by the position and action of the hand in demonstration videos.
arXiv Detail & Related papers (2021-08-12T11:45:38Z)
- Video Summarization Using Deep Neural Networks: A Survey [72.98424352264904]
Video summarization technologies aim to create a concise and complete synopsis by selecting the most informative parts of the video content.
This work focuses on the recent advances in the area and provides a comprehensive survey of the existing deep-learning-based methods for generic video summarization.
arXiv Detail & Related papers (2021-01-15T11:41:29Z)
- Incorporating Domain Knowledge To Improve Topic Segmentation Of Long MOOC Lecture Videos [4.189643331553923]
We propose an algorithm for automatically detecting different coherent topics present inside a long lecture video.
We use the language model on speech-to-text transcription to capture the implicit meaning of the whole video.
We also leverage domain knowledge to capture the way the instructor binds and connects different concepts while teaching.
arXiv Detail & Related papers (2020-12-08T13:37:40Z)
- A Comprehensive Review on Recent Methods and Challenges of Video Description [11.69687792533269]
Video description involves the generation of the natural language description of actions, events, and objects in the video.
Video description has various applications, such as bridging the gap between language and vision for visually impaired people.
In the past decade, much work has been done in this field on approaches and methods for video description, evaluation metrics, and datasets.
arXiv Detail & Related papers (2020-11-30T13:08:45Z)
- A Hierarchical Multi-Modal Encoder for Moment Localization in Video Corpus [31.387948069111893]
We show how to identify a short segment in a long video that semantically matches a text query.
To tackle this problem, we propose the HierArchical Multi-Modal EncodeR (HAMMER), which encodes a video at both the coarse-grained clip level and the fine-grained frame level.
We conduct extensive experiments to evaluate our model on moment localization in video corpus on ActivityNet Captions and TVR datasets.
arXiv Detail & Related papers (2020-11-18T02:42:36Z)
- Motion-supervised Co-Part Segmentation [88.40393225577088]
We propose a self-supervised deep learning method for co-part segmentation.
Our approach develops the idea that motion information inferred from videos can be leveraged to discover meaningful object parts.
arXiv Detail & Related papers (2020-04-07T09:56:45Z)
- Image Segmentation Using Deep Learning: A Survey [58.37211170954998]
Image segmentation is a key topic in image processing and computer vision.
A substantial number of works have aimed at developing image segmentation approaches using deep learning models.
arXiv Detail & Related papers (2020-01-15T21:37:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.