VLSNR: Vision-Linguistics Coordination Time Sequence-aware News Recommendation
- URL: http://arxiv.org/abs/2210.02946v1
- Date: Thu, 6 Oct 2022 14:27:37 GMT
- Title: VLSNR: Vision-Linguistics Coordination Time Sequence-aware News Recommendation
- Authors: Songhao Han (1), Wei Huang (1), Xiaotian Luan (2) ((1) Beihang
University, (2) Peking University)
- Abstract summary: Multimodal semantics is beneficial for enhancing the comprehension of users' temporal and long-lasting interests.
In our work, we propose a vision-linguistics coordinated, time sequence-aware news recommendation model.
We also construct a large-scale multimodal news recommendation dataset, V-MIND.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: News representation and user-oriented modeling are both essential for news recommendation. Most existing methods are based on textual information but ignore visual information and users' dynamic interests. However, compared with text-only content, multimodal semantics is beneficial for enhancing the comprehension of users' temporal and long-lasting interests. In our work, we propose a vision-linguistics coordinated, time sequence-aware news recommendation model. Firstly, a pretrained multimodal encoder is applied to embed images and texts into the same feature space. Then a self-attention network is used to learn the chronological click sequence. Additionally, an attentional GRU network is proposed to adequately model user preferences over time. Finally, the click-history and user representations are combined to calculate ranking scores for candidate news. Furthermore, we construct a large-scale multimodal news recommendation dataset, V-MIND. Experimental results show that our model outperforms baselines and achieves state-of-the-art results on our independently constructed dataset.
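To make the pipeline concrete, here is a minimal PyTorch sketch of the four stages the abstract describes. All module names, dimensions, and the fusion scheme are illustrative assumptions, not the authors' released code; in particular, the pretrained multimodal encoder is stubbed out as precomputed image and title features assumed to live in one shared space.

```python
# Minimal sketch of a VLSNR-style pipeline (illustrative, not the authors' code).
import torch
import torch.nn as nn

class AttentionPool(nn.Module):
    """Additive attention pooling over a sequence of hidden states."""
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)
        self.query = nn.Linear(dim, 1, bias=False)

    def forward(self, h):                              # h: (batch, seq, dim)
        w = torch.softmax(self.query(torch.tanh(self.proj(h))), dim=1)
        return (w * h).sum(dim=1)                      # (batch, dim)

class VLSNRSketch(nn.Module):
    def __init__(self, dim=512, heads=8):
        super().__init__()
        # Stand-in for a pretrained multimodal (CLIP-style) encoder: image and
        # title features are assumed precomputed; we only fuse the two views.
        self.fuse = nn.Linear(2 * dim, dim)
        # Self-attention over the chronologically ordered click sequence.
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        # Attentional GRU: recurrence over time, then attention pooling.
        self.gru = nn.GRU(dim, dim, batch_first=True)
        self.pool = AttentionPool(dim)

    def news_embed(self, img_feat, txt_feat):          # each (batch, n, dim)
        return self.fuse(torch.cat([img_feat, txt_feat], dim=-1))

    def user_embed(self, clicked):                     # (batch, hist, dim)
        h, _ = self.self_attn(clicked, clicked, clicked)
        h, _ = self.gru(h)
        return self.pool(h)                            # user representation

    def score(self, user, cand):                       # cand: (batch, n_cand, dim)
        return torch.einsum("bd,bnd->bn", user, cand)  # dot-product ranking

model = VLSNRSketch()
clicked = model.news_embed(torch.randn(2, 10, 512), torch.randn(2, 10, 512))
cand = model.news_embed(torch.randn(2, 5, 512), torch.randn(2, 5, 512))
scores = model.score(model.user_embed(clicked), cand)  # (2, 5)
```

The dot-product scorer mirrors the abstract's final step; a trained system would add negative sampling and a click loss on top.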
Related papers
- TARN-VIST: Topic Aware Reinforcement Network for Visual Storytelling [14.15543866199545]
As a cross-modal task, visual storytelling aims to generate a story for an ordered image sequence automatically.
We propose a novel method, Topic Aware Reinforcement Network for VIsual StoryTelling (TARN-VIST).
In particular, we pre-extract topic information of stories from both visual and linguistic perspectives.
arXiv Detail & Related papers (2024-03-18T08:01:23Z)
- MISSRec: Pre-training and Transferring Multi-modal Interest-aware Sequence Representation for Recommendation [61.45986275328629]
We propose MISSRec, a multi-modal pre-training and transfer learning framework for sequential recommendation.
On the user side, we design a Transformer-based encoder-decoder model, where the contextual encoder learns to capture the sequence-level multi-modal user interests.
On the candidate item side, we adopt a dynamic fusion module to produce user-adaptive item representations.
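A rough sketch of the two components this summary names, assuming precomputed per-item text and image features: a Transformer contextual encoder that pools a fused item sequence into a sequence-level interest vector, and a user-conditioned gate as one plausible form of the dynamic fusion module. Shapes and the gating form are this sketch's assumptions, not MISSRec's implementation (the decoder side is omitted).

```python
# Sketch: contextual encoding of multi-modal user interests + dynamic fusion.
import torch
import torch.nn as nn

class MISSRecSketch(nn.Module):
    def __init__(self, dim=256, heads=4, layers=2):
        super().__init__()
        enc = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc, layers)
        self.gate = nn.Linear(2 * dim, 2)              # per-modality fusion weights

    def user_interest(self, item_seq):                 # (batch, hist, dim)
        return self.encoder(item_seq).mean(dim=1)      # sequence-level interest

    def fuse_candidate(self, user, txt, img):          # each (batch, dim)
        # Weight the candidate's text/image features conditioned on the user.
        w = torch.softmax(self.gate(torch.cat([user, txt + img], dim=-1)), dim=-1)
        return w[:, :1] * txt + w[:, 1:] * img         # user-adaptive item repr.
```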
arXiv Detail & Related papers (2023-08-22T04:06:56Z)
- Modeling Multi-interest News Sequence for News Recommendation [0.6787897491422114]
A session-based news recommender system recommends the next news to a user by modeling the potential interests embedded in the sequence of news the user has read or clicked in a session.
This paper proposes a multi-interest news sequence (MINS) model for news recommendation.
In MINS, a news encoder based on self-attention is devised to learn an informative embedding for each piece of news, and a novel parallel interest network then extracts the multiple potential interests embedded in the news sequence in preparation for subsequent next-news recommendation.
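A minimal sketch of those two pieces: self-attention over the clicked-news embeddings, followed by a parallel interest network realized here as k learned query vectors, each attending over the sequence to produce one interest vector. This realization is an assumption for illustration, not the MINS code.

```python
# Sketch: self-attention news sequence encoder + k parallel interest heads.
import torch
import torch.nn as nn

class ParallelInterests(nn.Module):
    def __init__(self, dim=256, heads=4, k=3):
        super().__init__()
        self.seq_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.interest_queries = nn.Parameter(torch.randn(k, dim))

    def forward(self, news_seq):                       # (batch, hist, dim)
        h, _ = self.seq_attn(news_seq, news_seq, news_seq)
        # Each learned query attends over the sequence -> one interest vector.
        w = torch.softmax(h @ self.interest_queries.t(), dim=1)  # (batch, hist, k)
        return torch.einsum("bhk,bhd->bkd", w, h)                # (batch, k, dim)
```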
arXiv Detail & Related papers (2022-07-15T08:03:37Z)
- A Closer Look at Debiased Temporal Sentence Grounding in Videos: Dataset, Metric, and Approach [53.727460222955266]
Temporal Sentence Grounding in Videos (TSGV) aims to ground a natural language sentence in an untrimmed video.
Recent studies have found that current benchmark datasets may have obvious moment annotation biases.
We introduce a new evaluation metric "dR@n,IoU@m" that discounts the basic recall scores to alleviate the inflated evaluation caused by biased datasets.
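A hedged sketch of such a discounted recall: a top-1 prediction counts only if its temporal IoU with the ground truth clears the threshold m, and its contribution is then down-weighted by how far its start/end boundaries sit from the ground truth, normalized by video duration. This discount form is assumed for illustration and may not match the paper's exact definition.

```python
# Sketch of a discounted recall in the spirit of dR@n,IoU@m (assumed discount form).
def temporal_iou(pred, gt):
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = max(pred[1], gt[1]) - min(pred[0], gt[0])
    return inter / union if union > 0 else 0.0

def discounted_recall(preds, gts, durations, m=0.5):
    """preds/gts: (start, end) pairs, one top-1 prediction per query."""
    total = 0.0
    for p, g, dur in zip(preds, gts, durations):
        if temporal_iou(p, g) >= m:
            a_start = 1.0 - abs(p[0] - g[0]) / dur     # start-boundary discount
            a_end = 1.0 - abs(p[1] - g[1]) / dur       # end-boundary discount
            total += a_start * a_end                   # a hit counts for < 1
    return total / len(preds)
```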
arXiv Detail & Related papers (2022-03-10T08:58:18Z)
- Perceptual Score: What Data Modalities Does Your Model Perceive? [73.75255606437808]
We introduce the perceptual score, a metric that assesses the degree to which a model relies on the different subsets of the input features.
We find that recent, more accurate multi-modal models for visual question-answering tend to perceive the visual data less than their predecessors.
Using the perceptual score also helps to analyze model biases by decomposing the score into data subset contributions.
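One way to realize such a score is a permutation test: shuffle one modality's inputs across examples so predictions can no longer exploit it, then measure the relative accuracy drop. The exact normalization below is this sketch's assumption rather than the paper's verbatim formula.

```python
# Sketch of a permutation-based perceptual score for a two-modality classifier.
import numpy as np

def perceptual_score(predict, X_vis, X_txt, y, modality="vis", seed=0):
    """predict(X_vis, X_txt) -> predicted labels; y: ground-truth labels."""
    rng = np.random.default_rng(seed)
    base = np.mean(predict(X_vis, X_txt) == y)
    perm = rng.permutation(len(y))                     # break input-label pairing
    if modality == "vis":
        permuted = np.mean(predict(X_vis[perm], X_txt) == y)
    else:
        permuted = np.mean(predict(X_vis, X_txt[perm]) == y)
    return (base - permuted) / base if base > 0 else 0.0
```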
arXiv Detail & Related papers (2021-10-27T12:19:56Z)
- Learning Dual Dynamic Representations on Time-Sliced User-Item Interaction Graphs for Sequential Recommendation [62.30552176649873]
We devise a novel Dynamic Representation Learning model for Sequential Recommendation (DRL-SRe).
To better model the user-item interactions for characterizing the dynamics from both sides, the proposed model builds a global user-item interaction graph for each time slice.
To enable the model to capture fine-grained temporal information, we propose an auxiliary temporal prediction task over consecutive time slices.
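A small sketch of the time-slicing step only, assuming events arrive as (user, item, timestamp) triples: interactions are bucketed into consecutive slices and one bipartite user-item adjacency is built per slice. The graph representation learning and the auxiliary temporal prediction task are omitted.

```python
# Sketch: bucket interactions into per-slice bipartite user-item graphs.
from collections import defaultdict

def slice_interactions(events, t0, t1, n_slices):
    """events: iterable of (user, item, timestamp) with t0 <= timestamp < t1."""
    width = (t1 - t0) / n_slices
    graphs = [defaultdict(set) for _ in range(n_slices)]
    for user, item, ts in events:
        k = min(int((ts - t0) / width), n_slices - 1)  # clamp the boundary case
        graphs[k][user].add(item)                      # adjacency for slice k
    return graphs
```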
arXiv Detail & Related papers (2021-09-24T07:44:27Z)
- Neural News Recommendation with Collaborative News Encoding and Structural User Encoding [18.407727437603178]
We propose a news recommendation framework consisting of collaborative news encoding (CNE) and structural user encoding (SUE).
Experiment results on the MIND dataset validate the effectiveness of our model to improve the performance of news recommendation.
arXiv Detail & Related papers (2021-09-02T07:16:42Z)
- Dynamic Graph Collaborative Filtering [64.87765663208927]
Dynamic recommendation is essential for recommender systems to provide real-time predictions based on sequential data.
Here we propose Dynamic Graph Collaborative Filtering (DGCF), a novel framework leveraging dynamic graphs to capture collaborative and sequential relations.
Our approach achieves higher performance when the dataset contains less action repetition, indicating the effectiveness of integrating dynamic collaborative information.
arXiv Detail & Related papers (2021-01-08T04:16:24Z)
- TAGNN: Target Attentive Graph Neural Networks for Session-based Recommendation [66.04457457299218]
We propose a novel target attentive graph neural network (TAGNN) model for session-based recommendation.
In TAGNN, target-aware attention adaptively activates different user interests with respect to varied target items.
The learned interest representation vector varies with different target items, greatly improving the expressiveness of the model.
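A compact sketch of target-aware attention as summarized here: session item states are re-weighted by their relevance to each candidate target, so the session representation, and hence the score, varies per target. Names and shapes are illustrative, not the TAGNN code (the graph neural network over the session graph is omitted).

```python
# Sketch: target-aware attention producing target-specific session scores.
import torch
import torch.nn as nn

class TargetAttention(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.W = nn.Linear(dim, dim, bias=False)

    def forward(self, session, targets):       # (b, s, d), (b, t, d)
        logits = targets @ self.W(session).transpose(1, 2)   # (b, t, s)
        w = torch.softmax(logits, dim=-1)                    # attend per target
        sess = w @ session                                   # (b, t, d)
        return (sess * targets).sum(dim=-1)                  # (b, t) scores
```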
arXiv Detail & Related papers (2020-05-06T14:17:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.