Micro-video Tagging via Jointly Modeling Social Influence and Tag
Relation
- URL: http://arxiv.org/abs/2303.08318v1
- Date: Wed, 15 Mar 2023 02:13:34 GMT
- Title: Micro-video Tagging via Jointly Modeling Social Influence and Tag
Relation
- Authors: Xiao Wang, Tian Gan, Yinwei Wei, Jianlong Wu, Dai Meng, Liqiang Nie
- Abstract summary: 85.7% of micro-videos lack annotation.
Existing methods mostly focus on analyzing video content, neglecting users' social influence and tag relation.
We formulate micro-video tagging as a link prediction problem in a constructed heterogeneous network.
- Score: 56.23157334014773
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The last decade has witnessed the proliferation of micro-videos on various
user-generated content platforms. According to our statistics, around 85.7% of
micro-videos lack annotation. In this paper, we focus on annotating
micro-videos with tags. Existing methods mostly focus on analyzing video
content, neglecting users' social influence and tag relation. Meanwhile,
existing tag relation construction methods suffer from either deficient
performance or low tag coverage. To jointly model social influence and tag
relation, we formulate micro-video tagging as a link prediction problem in a
constructed heterogeneous network. Specifically, the tag relation (represented
by tag ontology) is constructed in a semi-supervised manner. Then, we combine
tag relation, video-tag annotation, and user-follow relation to build the
network. Afterward, better video and tag representations are derived through
Behavior Spread modeling and visual and linguistic knowledge aggregation.
Finally, the semantic similarity between each micro-video and all candidate
tags is calculated in this video-tag network. Extensive experiments on
industrial datasets of three verticals verify the superiority of our model
compared with several state-of-the-art baselines.
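As a minimal sketch of the final scoring step described above (not the authors' actual implementation), tagging-as-link-prediction can be illustrated by ranking candidate tags by semantic similarity to a video's learned representation. The function name, embedding dimensions, and tag vocabulary below are illustrative assumptions; in the paper, the embeddings would come from the heterogeneous video-tag network.

```python
import numpy as np

def rank_candidate_tags(video_emb, tag_embs, tag_names, top_k=3):
    """Score each candidate tag by cosine similarity to the video
    embedding and return the top_k (tag, score) pairs. Inputs are
    stand-ins for representations learned from the video-tag network."""
    v = video_emb / np.linalg.norm(video_emb)
    t = tag_embs / np.linalg.norm(tag_embs, axis=1, keepdims=True)
    scores = t @ v  # cosine similarity between the video and every tag
    order = np.argsort(-scores)[:top_k]
    return [(tag_names[i], float(scores[i])) for i in order]

# Toy example with random embeddings, purely for illustration.
rng = np.random.default_rng(0)
video = rng.normal(size=64)
tags = rng.normal(size=(5, 64))
names = ["dance", "cooking", "travel", "pets", "sports"]
ranked = rank_candidate_tags(video, tags, names)
print(ranked)
```

In the full model, the similarity would be computed between node representations enriched by social-influence and tag-relation signals rather than raw embeddings, but the ranking step takes this general shape.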
Related papers
- Hypergraph Multi-modal Large Language Model: Exploiting EEG and Eye-tracking Modalities to Evaluate Heterogeneous Responses for Video Understanding [25.4933695784155]
Understanding of video creativity and content often varies among individuals, with differences in focal points and cognitive levels across different ages, experiences, and genders.
To bridge the gap to real-world applications, we introduce a large-scale Subjective Response Indicators for Advertisement Videos dataset.
We developed tasks and protocols to analyze and evaluate the extent of cognitive understanding of video content among different users.
arXiv Detail & Related papers (2024-07-11T03:00:26Z) - Text-Video Retrieval via Variational Multi-Modal Hypergraph Networks [25.96897989272303]
The main obstacle for text-video retrieval is the semantic gap between the textual nature of queries and the visual richness of video content.
We propose chunk-level text-video matching, where the query chunks are extracted to describe a specific retrieval unit.
We formulate the chunk-level matching as n-ary correlations modeling between words of the query and frames of the video.
arXiv Detail & Related papers (2024-01-06T09:38:55Z) - Multi-Label Meta Weighting for Long-Tailed Dynamic Scene Graph
Generation [55.429541407920304]
Recognizing the predicate between subject and object pairs is imbalanced and multi-label in nature.
Recent state-of-the-art methods predominantly focus on the most frequently occurring predicate classes.
We introduce a multi-label meta-learning framework to deal with the biased predicate distribution.
arXiv Detail & Related papers (2023-06-16T18:14:23Z) - The ComMA Dataset V0.2: Annotating Aggression and Bias in Multilingual
Social Media Discourse [1.465840097113565]
We discuss the development of a multilingual dataset annotated with a hierarchical, fine-grained tagset marking different types of aggression and the "context" in which they occur.
The initial dataset consists of a total of 15,000 annotated comments in four languages.
As is usual on social media websites, a large number of these comments are multilingual, mostly code-mixed with English.
arXiv Detail & Related papers (2021-11-19T19:03:22Z) - Influencer Videos: Unboxing the Mystique [0.4143603294943439]
We study YouTube influencers and analyze their unstructured video data across text, audio and images.
Our prediction-based approach analyzes unstructured data and finds that "what is said" in words (text) is more influential than "how it is said" in imagery (images) or acoustics (audio).
We uncover novel findings that establish distinct associations for measures of shallow and deep engagement based on the dual-system framework of human thinking.
arXiv Detail & Related papers (2020-12-22T19:32:52Z) - VLG-Net: Video-Language Graph Matching Network for Video Grounding [57.6661145190528]
Grounding language queries in videos aims at identifying the time interval (or moment) semantically relevant to a language query.
We recast this challenge into an algorithmic graph matching problem.
We demonstrate superior performance over state-of-the-art grounding methods on three widely used datasets.
arXiv Detail & Related papers (2020-11-19T22:32:03Z) - Content-based Analysis of the Cultural Differences between TikTok and
Douyin [95.32409577885645]
Short-form video social media shifts away from the traditional media paradigm by telling the audience a dynamic story to attract their attention.
In particular, different combinations of everyday objects can be employed to represent a unique scene that is both interesting and understandable.
Offered by the same company, TikTok and Douyin are prominent examples of this new media form that has risen in recent years.
Our research primarily targets the hypothesis that the two platforms express cultural differences, together with media fashion and social idiosyncrasy.
arXiv Detail & Related papers (2020-11-03T01:47:49Z) - Labelling unlabelled videos from scratch with multi-modal
self-supervision [82.60652426371936]
Unsupervised labelling of a video dataset does not come for free from strong feature encoders.
We propose a novel clustering method that allows pseudo-labelling of a video dataset without any human annotations.
An extensive analysis shows that the resulting clusters have high semantic overlap to ground truth human labels.
arXiv Detail & Related papers (2020-06-24T12:28:17Z) - Comprehensive Information Integration Modeling Framework for Video
Titling [124.11296128308396]
We integrate comprehensive sources of information, including the content of consumer-generated videos, the narrative comment sentences supplied by consumers, and the product attributes, in an end-to-end modeling framework.
To integrate these heterogeneous sources, the proposed method consists of two processes, i.e., granular-level interaction modeling and abstraction-level story-line summarization.
We collect a large-scale dataset accordingly from real-world data in Taobao, a world-leading e-commerce platform.
arXiv Detail & Related papers (2020-06-24T10:38:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences.