Why Do We Click: Visual Impression-aware News Recommendation
- URL: http://arxiv.org/abs/2109.12651v1
- Date: Sun, 26 Sep 2021 16:58:14 GMT
- Title: Why Do We Click: Visual Impression-aware News Recommendation
- Authors: Jiahao Xun, Shengyu Zhang, Zhou Zhao, Jieming Zhu, Qi Zhang, Jingjie
Li, Xiuqiang He, Xiaofei He, Tat-Seng Chua, Fei Wu
- Abstract summary: This work is inspired by the fact that users make their click decisions mostly based on the visual impression they perceive when browsing news.
We propose to capture such visual impression information with visual-semantic modeling for news recommendation.
In addition, we inspect the impression from a global view and take structural information, such as the arrangement of different fields and the spatial position of different words on the impression, into the modeling of multiple modalities.
- Score: 108.73539346064386
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: There is soaring interest in the news recommendation research scenario
due to information overload. To accurately capture users' interests, we propose
to model multi-modal features, in addition to the news titles that are widely
used in existing works, for news recommendation. Besides, existing research
pays little attention to the click decision-making process in designing
multi-modal modeling modules. In this work, inspired by the fact that users
make their click decisions mostly based on the visual impression they perceive
when browsing news, we propose to capture such visual impression information
with visual-semantic modeling for news recommendation. Specifically, we devise
the local impression modeling module to simultaneously attend to decomposed
details in the impression when understanding the semantic meaning of news
title, which explicitly approximates the process by which users read news.
In addition, we inspect the impression from a global view and take structural
information, such as the arrangement of different fields and spatial position
of different words on the impression, into the modeling of multiple modalities.
To accommodate the research of visual impression-aware news recommendation, we
extend the text-dominated news recommendation dataset MIND by adding snapshot
impression images and will release it to nourish the research field. Extensive
comparisons with the state-of-the-art news recommenders along with the in-depth
analyses demonstrate the effectiveness of the proposed method and the promising
capability of modeling visual impressions for the content-based recommenders.
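The local impression modeling described in the abstract can be illustrated as a cross-attention step in which each title word attends over decomposed visual regions of the impression. The sketch below uses toy random features; the dimensions, fusion scheme, and function names are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def local_impression_attention(title_emb, region_emb):
    """Hypothetical sketch: each title word attends over decomposed
    impression regions, and the attended visual context is fused with
    the word embedding by concatenation."""
    # title_emb: (num_words, d); region_emb: (num_regions, d)
    scores = title_emb @ region_emb.T / np.sqrt(title_emb.shape[1])
    weights = softmax(scores, axis=-1)      # (num_words, num_regions)
    visual_ctx = weights @ region_emb       # visual context per word
    return np.concatenate([title_emb, visual_ctx], axis=-1)

rng = np.random.default_rng(0)
title = rng.normal(size=(6, 32))     # 6 title words
regions = rng.normal(size=(4, 32))   # 4 decomposed impression regions
fused = local_impression_attention(title, regions)
print(fused.shape)  # (6, 64)
```

Concatenation is only one plausible fusion choice; gating or summation would work equally well in this toy setting.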
Related papers
- Towards Retrieval-Augmented Architectures for Image Captioning [81.11529834508424]
This work presents a novel approach towards developing image captioning models that utilize an external kNN memory to improve the generation process.
Specifically, we propose two model variants that incorporate a knowledge retriever component that is based on visual similarities.
We experimentally validate our approach on COCO and nocaps datasets and demonstrate that incorporating an explicit external memory can significantly enhance the quality of captions.
arXiv Detail & Related papers (2024-05-21T18:02:07Z) - Information Screening whilst Exploiting! Multimodal Relation Extraction
with Feature Denoising and Multimodal Topic Modeling [96.75821232222201]
Existing research on multimodal relation extraction (MRE) faces two co-existing challenges, internal-information over-utilization and external-information under-exploitation.
We propose a novel framework that simultaneously implements the idea of internal-information screening and external-information exploiting.
arXiv Detail & Related papers (2023-05-19T14:56:57Z) - Named Entity and Relation Extraction with Multi-Modal Retrieval [51.660650522630526]
Multi-modal named entity recognition (NER) and relation extraction (RE) aim to leverage relevant image information to improve the performance of NER and RE.
We propose a novel Multi-modal Retrieval based framework (MoRe).
MoRe contains a text retrieval module and an image-based retrieval module, which retrieve related knowledge of the input text and image in the knowledge corpus respectively.
arXiv Detail & Related papers (2022-12-03T13:11:32Z) - Focus! Relevant and Sufficient Context Selection for News Image
Captioning [69.36678144800936]
News Image Captioning requires describing an image by leveraging additional context from a news article.
We propose to use the pre-trained vision and language retrieval model CLIP to localize the visually grounded entities in the news article.
Our experiments demonstrate that by simply selecting a better context from the article, we can significantly improve the performance of existing models.
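The context-selection idea in this entry can be sketched as ranking article sentences by similarity to the image and keeping the best few. The code below assumes precomputed CLIP-style embeddings (random here); the function name and top-k scheme are illustrative, not the paper's exact method.

```python
import numpy as np

def select_context(image_emb, sentence_embs, sentences, k=2):
    """Hypothetical sketch: rank article sentences by cosine similarity
    to the image embedding and keep the top-k as captioning context,
    preserving original article order."""
    img = image_emb / np.linalg.norm(image_emb)
    sents = sentence_embs / np.linalg.norm(sentence_embs, axis=1, keepdims=True)
    scores = sents @ img
    top = np.argsort(scores)[::-1][:k]      # indices of k most similar
    return [sentences[i] for i in sorted(top)]

rng = np.random.default_rng(1)
img = rng.normal(size=16)
sents = rng.normal(size=(5, 16))
ctx = select_context(img, sents, [f"s{i}" for i in range(5)], k=2)
print(len(ctx))  # 2
```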
arXiv Detail & Related papers (2022-12-01T20:00:27Z) - VLSNR: Vision-Linguistics Coordination Time Sequence-aware News
Recommendation [0.0]
Multimodal semantics is beneficial for enhancing the comprehension of users' temporal and long-lasting interests.
In our work, we propose a vision-linguistics coordinated, time-sequence-aware news recommendation model.
We also construct V-MIND, a large-scale multimodal news recommendation dataset.
arXiv Detail & Related papers (2022-10-06T14:27:37Z) - Modeling Multi-interest News Sequence for News Recommendation [0.6787897491422114]
A session-based news recommender system recommends the next news item to a user by modeling the potential interests embedded in the sequence of news the user has read or clicked within a session.
This paper proposes a multi-interest news sequence (MINS) model for news recommendation.
In MINS, a news encoder based on self-attention is devised to learn an informative embedding for each piece of news, and a novel parallel interest network is devised to extract the multiple potential interests embedded in the news sequence in preparation for subsequent next-news recommendations.
arXiv Detail & Related papers (2022-07-15T08:03:37Z) - On the Overlooked Significance of Underutilized Contextual Features in
Recent News Recommendation Models [14.40821643757877]
We show that articles' contextual features, such as click-through rate, popularity, or freshness, have recently been either neglected or underutilized.
We design a purposefully simple contextual module that can boost the previous news recommendation models by a large margin.
arXiv Detail & Related papers (2021-12-29T02:47:56Z) - Graph Enhanced Representation Learning for News Recommendation [85.3295446374509]
We propose a news recommendation method which can enhance the representation learning of users and news.
In our method, users and news are both viewed as nodes in a bipartite graph constructed from historical user click behaviors.
arXiv Detail & Related papers (2020-03-31T15:27:31Z)
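The bipartite-graph idea in the entry above can be sketched as one round of neighbor aggregation on the user-news click graph. The aggregation rule (averaging a node's embedding with the mean of its neighbors') and all shapes are toy assumptions for illustration, not the paper's actual propagation scheme.

```python
import numpy as np

def neighbor_aggregate(click_matrix, user_emb, news_emb):
    """Hypothetical sketch: on the user-news bipartite click graph,
    each node averages its own embedding with the mean embedding of
    its neighbors (clicked news for users, clicking users for news)."""
    deg_u = click_matrix.sum(axis=1, keepdims=True).clip(min=1)
    deg_n = click_matrix.sum(axis=0, keepdims=True).T.clip(min=1)
    user_out = (user_emb + click_matrix @ news_emb / deg_u) / 2
    news_out = (news_emb + click_matrix.T @ user_emb / deg_n) / 2
    return user_out, news_out

clicks = np.array([[1, 0, 1],
                   [0, 1, 1]], dtype=float)   # 2 users x 3 news items
rng = np.random.default_rng(2)
u, n = neighbor_aggregate(clicks, rng.normal(size=(2, 8)), rng.normal(size=(3, 8)))
print(u.shape, n.shape)  # (2, 8) (3, 8)
```

Stacking several such rounds would propagate information across multi-hop neighborhoods, which is the usual motivation for graph-enhanced representation learning.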
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences.