Related papers: Multilevel profiling of situation and dialogue-based deep networks for movie genre classification using movie trailers

URL: http://arxiv.org/abs/2109.06488v1
Date: Tue, 14 Sep 2021 07:33:56 GMT
Title: Multilevel profiling of situation and dialogue-based deep networks for movie genre classification using movie trailers
Authors: Dinesh Kumar Vishwakarma, Mayank Jindal, Ayush Mittal, Aditya Sharma
Abstract summary: We propose a novel multi-modality (situation, dialogue, and metadata-based) movie genre classification framework. We develop the English movie trailer dataset (EMTD), which contains 2000 Hollywood movie trailers belonging to five popular genres.
Score: 7.904790547594697
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Automated movie genre classification has emerged as an active and essential area of research and exploration. Short-duration movie trailers provide useful insights about the movie, as video content comprises both cognitive-level and affective-level features. Previous approaches focused on either cognitive or affective content analysis. In this paper, we propose a novel multi-modality (situation, dialogue, and metadata-based) movie genre classification framework that takes both cognition and affect-based features into consideration. The pre-features fusion-based framework takes into account situation-based features from regular snapshots of a trailer, comprising nouns and verbs that provide a useful affect-based mapping to the corresponding genres; dialogue (speech) based features from the audio; and metadata. Together these provide the relevant information for cognitive and affect-based video analysis. We also develop the English movie trailer dataset (EMTD), which contains 2000 Hollywood movie trailers belonging to five popular genres: Action, Romance, Comedy, Horror, and Science Fiction, and perform cross-validation on the standard LMTD-9 dataset to validate the proposed framework. The results demonstrate that the proposed methodology for movie genre classification performs excellently, as shown by the F1 scores, precision, recall, and area under the precision-recall curves.

Related papers

Movie Trailer Genre Classification Using Multimodal Pretrained Features [1.1743167854433303]
We introduce a novel method for movie genre classification, capitalizing on a diverse set of readily accessible pretrained models. Our approach utilizes all video and audio frames of movie trailers without performing any temporal pooling. Our method outperforms state-of-the-art movie genre classification models in terms of precision, recall, and mean average precision (mAP).
arXiv Detail & Related papers (2024-10-11T15:38:05Z)
MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions [69.9122231800796]
We present MMTrail, a large-scale multi-modality video-language dataset incorporating more than 20M trailer clips with visual captions. We propose a systemic captioning framework, achieving various modality annotations with more than 27.1k hours of trailer videos. Our dataset potentially paves the path for fine-grained large multimodal-language model training.
arXiv Detail & Related papers (2024-07-30T16:43:24Z)
Movie101v2: Improved Movie Narration Benchmark [53.54176725112229]
Automatic movie narration aims to generate video-aligned plot descriptions to assist visually impaired audiences. We introduce Movie101v2, a large-scale, bilingual dataset with enhanced data quality specifically designed for movie narration. Based on our new benchmark, we baseline a range of large vision-language models, including GPT-4V, and conduct an in-depth analysis of the challenges in narration generation.
arXiv Detail & Related papers (2024-04-20T13:15:27Z)
Beyond Labels: Leveraging Deep Learning and LLMs for Content Metadata [1.6574413179773761]
Analyzing the metadata can help understand the user preferences to generate personalized recommendations and item cold starting. We present some of the challenges associated with using genre label information and propose a new way of examining the genre information. The Genre Spectrum helps capture the various nuanced genres in a title and our offline and online experiments corroborate the effectiveness of the approach.
arXiv Detail & Related papers (2023-09-15T22:11:29Z)
Movie101: A New Movie Understanding Benchmark [47.24519006577205]
We construct a large-scale Chinese movie benchmark, named Movie101. We propose a new metric called Movie Narration Score (MNScore) for movie narrating evaluation. For both tasks, our proposed methods leverage external knowledge well and outperform carefully designed baselines.
arXiv Detail & Related papers (2023-05-20T08:43:51Z)
Movie Genre Classification by Language Augmentation and Shot Sampling [20.119729119879466]
We propose a Movie genre Classification method based on Language augmentatIon and shot samPling (Movie-CLIP). Movie-CLIP mainly consists of two parts: a language augmentation module to recognize language elements from the input audio, and a shot sampling module to select representative shots from the entire video. We evaluate our method on MovieNet and Condensed Movies datasets, achieving approximately 6-9% improvement in mean Average Precision (mAP) over the baselines.
arXiv Detail & Related papers (2022-03-24T18:15:12Z)
Film Trailer Generation via Task Decomposition [65.16768855902268]
We model movies as graphs, where nodes are shots and edges denote semantic relations between them. We learn these relations using joint contrastive training which leverages privileged textual information from screenplays. An unsupervised algorithm then traverses the graph and generates trailers that human judges prefer to ones generated by competitive supervised approaches.
arXiv Detail & Related papers (2021-11-16T20:50:52Z)
A Survey on Deep Learning Technique for Video Segmentation [147.0767454918527]
Video segmentation plays a critical role in a broad range of practical applications. Deep learning based approaches have been dedicated to video segmentation and delivered compelling performance.
arXiv Detail & Related papers (2021-07-02T15:51:07Z)
Rethinking movie genre classification with fine-grained semantic clustering [5.54966601302758]
We find large semantic variations between movies within a single genre definition. We expand these 'coarse' genre labels by identifying 'fine-grained' semantic information. Our approach is demonstrated on a newly introduced multi-modal 37,866,450 frame, 8,800 movie trailer dataset.
arXiv Detail & Related papers (2020-12-04T14:58:31Z)
Condensed Movies: Story Based Retrieval with Contextual Embeddings [83.73479493450009]
We create the Condensed Movies dataset (CMD) consisting of the key scenes from over 3K movies. The dataset is scalable, obtained automatically from YouTube, and is freely available for anybody to download and use. We provide a deep network baseline for text-to-video retrieval on our dataset, combining character, speech and visual cues into a single video embedding.
arXiv Detail & Related papers (2020-05-08T17:55:03Z)
