Demystifying Visual Features of Movie Posters for Multi-Label Genre Identification
- URL: http://arxiv.org/abs/2309.12022v1
- Date: Thu, 21 Sep 2023 12:39:36 GMT
- Title: Demystifying Visual Features of Movie Posters for Multi-Label Genre Identification
- Authors: Utsav Kumar Nareti, Chandranath Adak, Soumi Chattopadhyay
- Abstract summary: We present a deep transformer network with a probabilistic module to identify the movie genres exclusively from the poster. For experimental analysis, we procured 13,882 posters spanning 13 genres from the Internet Movie Database (IMDb).
- Score: 0.393259574660092
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the film industry, movie posters have been an essential part of
advertising and marketing for many decades, and continue to play a vital role
even today in the form of digital posters through online, social media and OTT
platforms. Typically, movie posters can effectively promote and communicate the
essence of a film, such as its genre, visual style/tone, vibe, and storyline
cue/theme, which are essential to attract potential viewers. Identifying the
genres of a movie often has significant practical applications in recommending
the film to target audiences. Previous studies on movie genre identification
are limited to subtitles, plot synopses, and movie scenes that are mostly
accessible after the movie release. Posters usually contain pre-release
implicit information to generate mass interest. In this paper, we work on
automated multi-label genre identification solely from movie poster images,
without the aid of any additional textual/meta-data information about the movies,
which is one of the earliest attempts of its kind. Here, we present a deep
transformer network with a probabilistic module to identify the movie genres
exclusively from the poster. For experimental analysis, we procured 13,882
posters spanning 13 genres from the Internet Movie Database (IMDb), where
our model performances were encouraging and even outperformed some major
contemporary architectures.
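The multi-label decision stage of a setup like the one described above can be illustrated with a minimal sketch, assuming per-genre sigmoid scores and a fixed threshold; the genre subset, logit values, and fallback rule below are illustrative, not the paper's actual probabilistic module:

```python
import numpy as np

# Illustrative genre subset; the paper works with 13 IMDb genres.
GENRES = ["Action", "Comedy", "Drama", "Horror", "Romance"]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def predict_genres(logits, threshold=0.5):
    """Multi-label decision: keep every genre whose sigmoid score clears
    the threshold; fall back to the top-scoring genre if none do."""
    probs = sigmoid(np.asarray(logits, dtype=float))
    picked = [g for g, p in zip(GENRES, probs) if p >= threshold]
    if not picked:  # every poster gets at least one genre
        picked = [GENRES[int(np.argmax(probs))]]
    return picked

print(predict_genres([2.1, -0.3, 1.2, -2.0, -0.1]))  # ['Action', 'Drama']
```

Unlike single-label softmax classification, each genre score here is judged independently, which is what lets one poster carry several genres at once.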
Related papers
- Movie101v2: Improved Movie Narration Benchmark [53.54176725112229]
We develop a large-scale, bilingual movie narration dataset, Movie101v2.
Taking into account the essential difficulties in achieving applicable movie narration, we break the long-term goal into three progressive stages.
Our findings reveal that achieving applicable movie narration generation is a fascinating goal that requires thorough research.
arXiv Detail & Related papers (2024-04-20T13:15:27Z)
- Towards Automated Movie Trailer Generation [98.9854474456265]
We introduce Trailer Generation Transformer (TGT), a deep-learning framework utilizing an encoder-decoder architecture.
TGT movie encoder is tasked with contextualizing each movie shot representation via self-attention, while the autoregressive trailer decoder predicts the feature representation of the next trailer shot.
Our TGT significantly outperforms previous methods on a comprehensive suite of metrics.
arXiv Detail & Related papers (2024-04-04T14:28:34Z)
- Beyond Labels: Leveraging Deep Learning and LLMs for Content Metadata [1.6574413179773761]
Analyzing this metadata can help in understanding user preferences to generate personalized recommendations and address item cold-starting.
We present some of the challenges associated with using genre label information and propose a new way of examining the genre information.
The Genre Spectrum helps capture the various nuanced genres in a title and our offline and online experiments corroborate the effectiveness of the approach.
arXiv Detail & Related papers (2023-09-15T22:11:29Z)
- MovieCLIP: Visual Scene Recognition in Movies [38.90153620199725]
Existing visual scene datasets in movies are limited in coverage and do not consider the visual scene transitions within movie clips.
In this work, we address the problem of visual scene recognition in movies by first automatically curating a new and extensive movie-centric taxonomy.
Instead of manual annotations which can be expensive, we use CLIP to weakly label 1.12 million shots from 32K movie clips based on our proposed taxonomy.
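The weak-labelling step described in this entry can be sketched as a cosine-similarity scorer; this is a minimal sketch assuming the shot's image embedding and the taxonomy's text embeddings have already been produced by CLIP, and the toy vectors and class names below are illustrative:

```python
import numpy as np

def weak_label(shot_emb, class_embs, class_names, top_k=1):
    """Label a shot with the taxonomy classes whose text embeddings are
    most cosine-similar to the shot's image embedding."""
    shot = np.asarray(shot_emb, dtype=float)
    shot = shot / np.linalg.norm(shot)
    classes = np.asarray(class_embs, dtype=float)
    classes = classes / np.linalg.norm(classes, axis=1, keepdims=True)
    sims = classes @ shot               # cosine similarity per class
    order = np.argsort(-sims)[:top_k]   # best-matching classes first
    return [class_names[i] for i in order]

# Toy 3-D "embeddings": the shot points mostly along the 'kitchen' axis.
names = ["kitchen", "street", "forest"]
embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(weak_label([0.9, 0.1, 0.0], embs, names))  # ['kitchen']
```

Because no human annotation enters this loop, the resulting labels are noisy ("weak"), which is the trade-off that makes labelling 1.12 million shots tractable.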
arXiv Detail & Related papers (2022-10-20T07:38:56Z)
- Ethnic Representation Analysis of Commercial Movie Posters [0.0]
We develop a novel approach for evaluating ethnic bias in the film industry by analyzing nearly 125,000 posters using state-of-the-art deep learning models.
Our analysis shows that while ethnic biases still exist, there is a trend of reduction of bias, as seen by several parameters.
An automatic approach to monitor ethnic diversity in the film industry, potentially integrated with financial value, may be of significant use for producers and policymakers.
arXiv Detail & Related papers (2022-07-17T13:13:02Z)
- Film Trailer Generation via Task Decomposition [65.16768855902268]
We model movies as graphs, where nodes are shots and edges denote semantic relations between them.
We learn these relations using joint contrastive training which leverages privileged textual information from screenplays.
An unsupervised algorithm then traverses the graph and generates trailers that human judges prefer to ones generated by competitive supervised approaches.
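The graph traversal mentioned in this entry can be illustrated with a toy greedy walk; the edge weights below stand in for the learned shot relations, and both the edge list and the greedy rule are illustrative, not the paper's actual algorithm:

```python
def trailer_walk(edges, start, length=3):
    """Greedy walk over a shot graph: repeatedly hop to the
    strongest-related unvisited neighbour, collecting trailer shots."""
    path, current, seen = [start], start, {start}
    while len(path) < length:
        candidates = [(w, v) for u, v, w in edges if u == current and v not in seen]
        if not candidates:
            break
        _, current = max(candidates)  # highest edge weight wins
        path.append(current)
        seen.add(current)
    return path

# Toy graph: shots a-d with pairwise relation strengths as edge weights.
edges = [("a", "b", 0.9), ("a", "c", 0.2), ("b", "c", 0.8), ("b", "d", 0.1)]
print(trailer_walk(edges, "a"))  # ['a', 'b', 'c']
```

The appeal of a walk-based formulation is that trailer generation needs no shot-level supervision once the edge weights are learned.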
arXiv Detail & Related papers (2021-11-16T20:50:52Z)
- Multilevel profiling of situation and dialogue-based deep networks for movie genre classification using movie trailers [7.904790547594697]
We propose a novel multi-modal movie genre classification framework based on situation, dialogue, and metadata.
We develop the English movie trailer dataset (EMTD), which contains 2000 Hollywood movie trailers belonging to five popular genres.
arXiv Detail & Related papers (2021-09-14T07:33:56Z)
- Political Posters Identification with Appearance-Text Fusion [49.55696202606098]
We propose a method that efficiently utilizes appearance features and text vectors to accurately classify political posters.
The majority of this work focuses on political posters that are designed to serve as a promotion of a certain political event.
arXiv Detail & Related papers (2020-12-19T16:14:51Z)
- Movie Summarization via Sparse Graph Construction [65.16768855902268]
We propose a model that identifies TP scenes by building a sparse movie graph that represents relations between scenes and is constructed using multimodal information.
According to human judges, the summaries created by our approach are more informative and complete, and receive higher ratings, than the outputs of sequence-based models and general-purpose summarization algorithms.
arXiv Detail & Related papers (2020-12-14T13:54:34Z)
- Rethinking movie genre classification with fine-grained semantic clustering [5.54966601302758]
We find large semantic variations between movies within a single genre definition.
We expand these 'coarse' genre labels by identifying 'fine-grained' semantic information.
Our approach is demonstrated on a newly introduced multi-modal 37,866,450 frame, 8,800 movie trailer dataset.
arXiv Detail & Related papers (2020-12-04T14:58:31Z)
- A Unified Framework for Shot Type Classification Based on Subject Centric Lens [89.26211834443558]
We propose a learning framework for shot type recognition using a Subject Guidance Network (SGNet).
SGNet separates the subject and background of a shot into two streams, serving as separate guidance maps for scale and movement type classification respectively.
We build a large-scale dataset MovieShots, which contains 46K shots from 7K movie trailers with annotations of their scale and movement types.
arXiv Detail & Related papers (2020-08-08T15:49:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.