Beyond Labels: Leveraging Deep Learning and LLMs for Content Metadata
- URL: http://arxiv.org/abs/2309.08787v1
- Date: Fri, 15 Sep 2023 22:11:29 GMT
- Title: Beyond Labels: Leveraging Deep Learning and LLMs for Content Metadata
- Authors: Saurabh Agrawal, John Trenkle, Jaya Kawale
- Abstract summary: Analyzing the metadata can help understand user preferences in order to generate personalized recommendations and address item cold start.
We present some of the challenges associated with using genre label information and propose a new way of examining the genre information.
The Genre Spectrum helps capture the various nuanced genres in a title, and our offline and online experiments corroborate the effectiveness of the approach.
- Score: 1.6574413179773761
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Content metadata plays a very important role in movie recommender systems, as
it provides valuable information about various aspects of a movie such as
genre, cast, plot synopsis, and box office summary. Analyzing the metadata can
help understand user preferences in order to generate personalized
recommendations and address item cold start. In this talk, we focus on one
particular type of metadata: genre labels. Genre labels associated with a movie
or a TV series help categorize a collection of titles into different themes and
correspondingly set audience expectations. We present some of the
challenges associated with using genre label information and propose a new way
of examining genre information that we call the Genre Spectrum.
The Genre Spectrum helps capture the various nuanced genres in a title, and our
offline and online experiments corroborate the effectiveness of the approach.
Furthermore, we also discuss applications of LLMs in augmenting content
metadata, which could eventually be used to organize recommendations
effectively in a user's 2-D home grid.
Related papers
- ScreenWriter: Automatic Screenplay Generation and Movie Summarisation [55.20132267309382]
Video content has driven demand for textual descriptions or summaries that allow users to recall key plot points or get an overview without watching.
We propose the task of automatic screenplay generation, and a method, ScreenWriter, that operates only on video and produces output which includes dialogue, speaker names, scene breaks, and visual descriptions.
ScreenWriter introduces a novel algorithm to segment the video into scenes based on the sequence of visual vectors, and a novel method for the challenging problem of determining character names, based on a database of actors' faces.
arXiv Detail & Related papers (2024-10-17T07:59:54Z)
- Movie101v2: Improved Movie Narration Benchmark [53.54176725112229]
Automatic movie narration aims to generate video-aligned plot descriptions to assist visually impaired audiences.
We introduce Movie101v2, a large-scale, bilingual dataset with enhanced data quality specifically designed for movie narration.
Based on our new benchmark, we baseline a range of large vision-language models, including GPT-4V, and conduct an in-depth analysis of the challenges in narration generation.
arXiv Detail & Related papers (2024-04-20T13:15:27Z)
- Improving Retrieval in Theme-specific Applications using a Corpus Topical Taxonomy [52.426623750562335]
We introduce the ToTER (Topical Taxonomy Enhanced Retrieval) framework.
ToTER identifies the central topics of queries and documents with the guidance of the taxonomy, and exploits their topical relatedness to supplement missing contexts.
As a plug-and-play framework, ToTER can be flexibly employed to enhance various PLM-based retrievers.
arXiv Detail & Related papers (2024-03-07T02:34:54Z)
- SOVC: Subject-Oriented Video Captioning [59.04029220586337]
We propose a new video captioning task, Subject-Oriented Video Captioning (SOVC), which aims to allow users to specify the describing target via a bounding box.
To support this task, we construct two subject-oriented video captioning datasets based on two widely used video captioning datasets.
arXiv Detail & Related papers (2023-12-20T17:44:32Z)
- Panel Transitions for Genre Analysis in Visual Narratives [1.320904960556043]
We present a novel approach to multi-modal genre analysis based on comics and manga-style visual narratives.
We highlight some of the limitations and challenges of our existing computational approaches in modeling subjective labels.
arXiv Detail & Related papers (2023-12-14T08:05:09Z)
- Description-Enhanced Label Embedding Contrastive Learning for Text Classification [65.01077813330559]
We employ Self-Supervised Learning (SSL) in the model learning process and design a novel self-supervised Relation of Relation (R2) classification task.
We propose the Relation of Relation Learning Network (R2-Net) for text classification, in which text classification and R2 classification are treated as optimization targets.
We incorporate external knowledge from WordNet to obtain multi-aspect descriptions for label semantic learning.
arXiv Detail & Related papers (2023-06-15T02:19:34Z)
- Multilevel profiling of situation and dialogue-based deep networks for movie genre classification using movie trailers [7.904790547594697]
We propose a novel multi-modal movie genre classification framework based on situation, dialogue, and metadata.
We develop the English movie trailer dataset (EMTD), which contains 2000 Hollywood movie trailers belonging to five popular genres.
arXiv Detail & Related papers (2021-09-14T07:33:56Z)
- MATCH: Metadata-Aware Text Classification in A Large Hierarchy [60.59183151617578]
MATCH is an end-to-end framework that leverages both metadata and hierarchy information.
We propose different ways to regularize the parameters and output probability of each child label by its parents.
Experiments on two massive text datasets with large-scale label hierarchies demonstrate the effectiveness of MATCH.
arXiv Detail & Related papers (2021-02-15T05:23:08Z)
- Rethinking movie genre classification with fine-grained semantic clustering [5.54966601302758]
We find large semantic variations between movies within a single genre definition.
We expand these 'coarse' genre labels by identifying 'fine-grained' semantic information.
Our approach is demonstrated on a newly introduced multi-modal dataset of 8,800 movie trailers comprising 37,866,450 frames.
arXiv Detail & Related papers (2020-12-04T14:58:31Z)
- A multimodal approach for multi-label movie genre classification [2.1342631813973507]
We created a dataset composed of trailer video clips, subtitles, synopses, and movie posters from 152,622 movie titles from The Movie Database.
The dataset was carefully curated and organized, and it was also made available as a contribution of this work.
arXiv Detail & Related papers (2020-06-01T00:51:39Z)
- Minimally Supervised Categorization of Text with Metadata [40.13841133991089]
We propose MetaCat, a minimally supervised framework to categorize text with metadata.
We develop a generative process describing the relationships between words, documents, labels, and metadata.
Based on the same generative process, we synthesize training samples to address the bottleneck of label scarcity.
arXiv Detail & Related papers (2020-05-01T21:42:32Z)