Multi-Level Visual Similarity Based Personalized Tourist Attraction
Recommendation Using Geo-Tagged Photos
- URL: http://arxiv.org/abs/2109.08275v1
- Date: Fri, 17 Sep 2021 01:34:15 GMT
- Title: Multi-Level Visual Similarity Based Personalized Tourist Attraction
Recommendation Using Geo-Tagged Photos
- Authors: Ling Chen, Dandan Lyu, Shanshan Yu, and Gencai Chen
- Abstract summary: We propose multi-level visual similarity based personalized tourist attraction recommendation using geo-tagged photos.
We define four visual similarity levels and introduce a corresponding quintuplet loss to embed the visual contents of photos.
To capture the significance of different photos, we exploit the self-attention mechanism to obtain the visual representations of users and tourist attractions.
- Score: 7.176673263585931
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Geo-tagged photo based tourist attraction recommendation can discover users'
travel preferences from the photos they take, and thus recommend suitable
tourist attractions to them. However, existing visual content based methods
cannot fully exploit the user and tourist attraction information of photos to
extract visual features, and do not differentiate the significance of
different photos. In this paper, we propose multi-level visual similarity
based personalized tourist attraction recommendation using geo-tagged photos (MEAL).
MEAL utilizes the visual contents of photos and interaction behavior data to
obtain the final embeddings of users and tourist attractions, which are then
used to predict visit probabilities. Specifically, by crossing the user and
tourist attraction information of photos, we define four visual similarity
levels and introduce a corresponding quintuplet loss to embed the visual
contents of photos (a hedged sketch follows the abstract). In addition, to
capture the significance of different photos, we exploit the self-attention
mechanism to obtain the visual representations of users and tourist
attractions (see the second sketch below). We conducted experiments on a
dataset crawled from Flickr, and the experimental results demonstrated the
advantage of this method.
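The four similarity levels come from crossing the user and attraction labels of a photo pair: same user and same attraction, same user only, same attraction only, or neither. A minimal PyTorch sketch of a hinge-style quintuplet loss over such levels follows; the level ordering, Euclidean distance, and margin values are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def quintuplet_loss(anchor, same_user_same_attr, same_user_diff_attr,
                    diff_user_same_attr, diff_user_diff_attr,
                    margins=(0.1, 0.2, 0.3)):
    """Hinge-style quintuplet loss sketch over four similarity levels.

    All inputs are (batch, dim) photo embeddings paired with the anchor.
    Assumed ordering, most to least similar: same user & same attraction,
    same user only, same attraction only, neither. Margins are
    illustrative values, not taken from the paper.
    """
    d = F.pairwise_distance  # Euclidean distance per pair
    d1 = d(anchor, same_user_same_attr)
    d2 = d(anchor, same_user_diff_attr)
    d3 = d(anchor, diff_user_same_attr)
    d4 = d(anchor, diff_user_diff_attr)
    # Each hinge term pushes the closer level at least a margin nearer
    # to the anchor than the next level, generalizing the triplet loss
    # to four ranked comparisons.
    loss = (F.relu(d1 - d2 + margins[0])
            + F.relu(d2 - d3 + margins[1])
            + F.relu(d3 - d4 + margins[2]))
    return loss.mean()
```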
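The attention-based aggregation and the final prediction can likewise be sketched. Below, a simple additive-attention pooling stands in for the paper's self-attention mechanism, and a dot-product-plus-sigmoid scorer is an assumed stand-in for the visit probability head; module names and dimensions are hypothetical.

```python
import torch
import torch.nn as nn

class AttentivePooling(nn.Module):
    """Weights each photo embedding by a learned score and sums them,
    so more informative photos dominate the final representation."""

    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(),
                                   nn.Linear(dim, 1))

    def forward(self, photos: torch.Tensor) -> torch.Tensor:
        # photos: (num_photos, dim) embeddings of one user's or one
        # attraction's photos.
        weights = torch.softmax(self.score(photos), dim=0)  # (num_photos, 1)
        return (weights * photos).sum(dim=0)                # (dim,)

# Assumed scoring of a user-attraction pair (not the paper's exact head):
pool = AttentivePooling(dim=128)
user_repr = pool(torch.randn(12, 128))        # 12 photos taken by the user
attraction_repr = pool(torch.randn(40, 128))  # 40 photos at the attraction
visit_prob = torch.sigmoid(user_repr @ attraction_repr)  # scalar in (0, 1)
```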
Related papers
- Enhancing Historical Image Retrieval with Compositional Cues [3.2276097734075426]
We introduce a crucial factor from computational aesthetics, namely image composition, into this topic.
By explicitly integrating composition-related information extracted by a CNN into the designed retrieval model, our method considers both the image's composition rules and semantic information.
arXiv Detail & Related papers (2024-03-21T10:51:19Z)
- Tell Me What Is Good About This Property: Leveraging Reviews For Segment-Personalized Image Collection Summarization [3.063926257586959]
We consider user intentions in the summarization of property visuals by analyzing property reviews.
By incorporating the insights from reviews in our visual summaries, we enhance the summaries by presenting the relevant content to a user.
Our experiments, including human perceptual studies, demonstrate the superiority of our cross-modal approach.
arXiv Detail & Related papers (2023-10-30T17:06:49Z)
- Impressions: Understanding Visual Semiotics and Aesthetic Impact [66.40617566253404]
We present Impressions, a novel dataset through which to investigate the semiotics of images.
We show that existing multimodal image captioning and conditional generation models struggle to simulate plausible human responses to images.
This dataset significantly improves their ability to model impressions and aesthetic evaluations of images through fine-tuning and few-shot adaptation.
arXiv Detail & Related papers (2023-10-27T04:30:18Z)
- Image Aesthetics Assessment via Learnable Queries [59.313054821874864]
We propose the Image Aesthetics Assessment via Learnable Queries (IAA-LQ) approach.
It adapts learnable queries to extract aesthetic features from pre-trained image features obtained from a frozen image encoder (a sketch of this learnable-query pattern appears after this list).
Experiments on real-world data demonstrate the advantages of IAA-LQ, beating the best state-of-the-art method by 2.2% and 2.1% in terms of SRCC and PLCC, respectively.
arXiv Detail & Related papers (2023-09-06T09:42:16Z)
- StyleEDL: Style-Guided High-order Attention Network for Image Emotion Distribution Learning [69.06749934902464]
We propose a style-guided high-order attention network for image emotion distribution learning termed StyleEDL.
StyleEDL interactively learns stylistic-aware representations of images by exploring the hierarchical stylistic information of visual contents.
In addition, we introduce a stylistic graph convolutional network to dynamically generate the content-dependent emotion representations.
arXiv Detail & Related papers (2023-08-06T03:22:46Z)
- Identifying Professional Photographers Through Image Quality and Aesthetics in Flickr [0.0]
This study reveals the lack of suitable datasets on photo and video sharing platforms.
We created one of the largest labelled Flickr datasets, with multimodal data, and have open sourced it.
We examined the relationship between a picture's aesthetic and technical quality and its social activity.
arXiv Detail & Related papers (2023-07-04T14:55:37Z)
- Photoswap: Personalized Subject Swapping in Images [56.2650908740358]
Photoswap learns the visual concept of the subject from reference images and swaps it into the target image using pre-trained diffusion models.
Photoswap significantly outperforms baseline methods in human ratings across subject swapping, background preservation, and overall quality.
arXiv Detail & Related papers (2023-05-29T17:56:13Z)
- FaIRCoP: Facial Image Retrieval using Contrastive Personalization [43.293482565385055]
Retrieving facial images from attributes plays a vital role in various systems such as face recognition and suspect identification.
Existing methods do so by comparing specific characteristics from the user's mental image against the suggested images.
We propose a method that uses the user's feedback to label images as either similar or dissimilar to the target image.
arXiv Detail & Related papers (2022-05-28T09:52:09Z)
- From A Glance to "Gotcha": Interactive Facial Image Retrieval with Progressive Relevance Feedback [72.29919762941029]
We propose an end-to-end framework to retrieve facial images with relevance feedback progressively provided by the witness.
Requiring no extra annotations, our model can be applied at the cost of a little response effort.
arXiv Detail & Related papers (2020-07-30T18:46:25Z)
- Unsupervised Learning of Landmarks based on Inter-Intra Subject Consistencies [72.67344725725961]
We present a novel unsupervised learning approach to image landmark discovery by incorporating the inter-subject landmark consistencies on facial images.
This is achieved via an inter-subject mapping module that transforms original subject landmarks based on an auxiliary subject-related structure.
To recover from the transformed images back to the original subject, the landmark detector is forced to learn spatial locations that contain the consistent semantic meanings both for the paired intra-subject images and between the paired inter-subject images.
arXiv Detail & Related papers (2020-04-16T20:38:16Z)
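For the learnable-query pattern mentioned in the Image Aesthetics Assessment via Learnable Queries entry above, a minimal sketch follows: a small set of trained query vectors cross-attends to features from a frozen encoder, and a light head maps the result to a score. The class name, dimensions, head count, and single linear output are illustrative assumptions, not the IAA-LQ architecture.

```python
import torch
import torch.nn as nn

class LearnableQueryHead(nn.Module):
    """Learnable queries cross-attending to frozen encoder features."""

    def __init__(self, num_queries: int = 8, dim: int = 256):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(num_queries, dim) * 0.02)
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.head = nn.Linear(dim, 1)  # e.g. a scalar aesthetic score

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (batch, num_patches, dim) features from a frozen encoder;
        # only the queries, attention, and head receive gradients.
        q = self.queries.unsqueeze(0).expand(feats.size(0), -1, -1)
        out, _ = self.attn(q, feats, feats)            # (batch, num_queries, dim)
        return self.head(out.mean(dim=1)).squeeze(-1)  # (batch,)
```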