Learning the Compositional Visual Coherence for Complementary
Recommendations
- URL: http://arxiv.org/abs/2006.04380v1
- Date: Mon, 8 Jun 2020 06:57:18 GMT
- Title: Learning the Compositional Visual Coherence for Complementary
Recommendations
- Authors: Zhi Li, Bo Wu, Qi Liu, Likang Wu, Hongke Zhao, Tao Mei
- Abstract summary: Complementary recommendations aim at providing users with product suggestions that are supplementary to and compatible with their obtained items.
We propose a novel Content Attentive Neural Network (CANN) to model the comprehensive compositional coherence on both global contents and semantic contents.
- Score: 62.60648815930101
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Complementary recommendations, which aim at providing users with
product suggestions that are supplementary to and compatible with their
obtained items, have become a hot topic in both academia and industry in
recent years. Existing work has mainly focused on modeling the co-purchase
relations between two items,
but the compositional associations of item collections are largely unexplored.
Intuitively, when a user chooses complementary items for purchased products,
she considers the visual semantic coherence (such as color collocations and
texture compatibilities) in addition to global impressions. To this end, we
propose a novel Content
Attentive Neural Network (CANN) to model the comprehensive compositional
coherence on both global contents and semantic contents. Specifically, we first
propose a \textit{Global Coherence Learning} (GCL) module based on multi-head
attention to model the global compositional coherence. Then, we generate the
semantic-focal representations from different semantic regions and design a
\textit{Focal Coherence Learning} (FCL) module to learn the focal compositional
coherence from different semantic-focal representations. Finally, we optimize
CANN with a novel compositional optimization strategy. Extensive experiments
on large-scale real-world data clearly demonstrate the effectiveness of
CANN compared with several state-of-the-art methods.
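
The abstract gives no code, but the core GCL idea (multi-head self-attention
over the visual embeddings of an item collection, pooled into a coherence
score) is straightforward to sketch. The PyTorch snippet below is a minimal,
hypothetical illustration, not the authors' implementation; the module name,
feature dimensions, and pooling choice are all assumptions made for
exposition.

```python
import torch
import torch.nn as nn

class GlobalCoherenceSketch(nn.Module):
    """Hypothetical GCL-style module (not the paper's code): multi-head
    self-attention over item embeddings, pooled into a coherence score."""

    def __init__(self, embed_dim: int = 256, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(embed_dim)
        self.score = nn.Linear(embed_dim, 1)

    def forward(self, item_embeds: torch.Tensor) -> torch.Tensor:
        # item_embeds: (batch, num_items, embed_dim) visual features,
        # one row per product image in the candidate collection.
        attended, _ = self.attn(item_embeds, item_embeds, item_embeds)
        fused = self.norm(item_embeds + attended)  # residual + layer norm
        pooled = fused.mean(dim=1)                 # summarize the collection
        return self.score(pooled).squeeze(-1)      # (batch,) coherence score

# A focal (FCL-style) variant could run the same attention per semantic
# region (e.g. color or texture regions) and merge the per-region outputs.
gcl = GlobalCoherenceSketch()
print(gcl(torch.randn(2, 4, 256)).shape)  # torch.Size([2])
```

The paper's full model additionally learns semantic-focal representations and
uses a dedicated compositional optimization strategy, which this sketch omits.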
Related papers
- IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation [70.8833857249951]
IterComp is a novel framework that aggregates composition-aware model preferences from multiple models.
We propose an iterative feedback learning method to enhance compositionality in a closed-loop manner.
IterComp opens new research avenues in reward feedback learning for diffusion models and compositional generation.
arXiv Detail & Related papers (2024-10-09T17:59:13Z) - Embedding Generalized Semantic Knowledge into Few-Shot Remote Sensing Segmentation [26.542268630980814]
Few-shot segmentation (FSS) for remote sensing (RS) imagery leverages supporting information from limited annotated samples to achieve query segmentation of novel classes.
Previous efforts are dedicated to mining segmentation-guiding visual cues from a constrained set of support samples.
We propose a holistic semantic embedding (HSE) approach that effectively harnesses general semantic knowledge.
arXiv Detail & Related papers (2024-05-22T14:26:04Z) - Contextualization Distillation from Large Language Model for Knowledge
Graph Completion [51.126166442122546]
We introduce the Contextualization Distillation strategy, a plug-in-and-play approach compatible with both discriminative and generative KGC frameworks.
Our method begins by instructing large language models to transform compact, structural triplets into context-rich segments.
Comprehensive evaluations across diverse datasets and KGC techniques highlight the efficacy and adaptability of our approach.
arXiv Detail & Related papers (2024-01-28T08:56:49Z) - Multi-dimensional Fusion and Consistency for Semi-supervised Medical
Image Segmentation [10.628250457432499]
We introduce a novel semi-supervised learning framework tailored for medical image segmentation.
Central to our approach is the innovative Multi-scale Text-aware ViT-CNN Fusion scheme.
We propose the Multi-Axis Consistency framework for generating robust pseudo labels.
arXiv Detail & Related papers (2023-09-12T22:21:14Z) - Multi-Grained Multimodal Interaction Network for Entity Linking [65.30260033700338]
Multimodal entity linking task aims at resolving ambiguous mentions to a multimodal knowledge graph.
We propose a novel Multi-GraIned Multimodal InteraCtion Network (MIMIC) framework for solving the MEL task.
arXiv Detail & Related papers (2023-07-19T02:11:19Z) - ProCC: Progressive Cross-primitive Compatibility for Open-World
Compositional Zero-Shot Learning [29.591615811894265]
Open-World Compositional Zero-shot Learning (OW-CZSL) aims to recognize novel compositions of state and object primitives in images with no priors on the compositional space.
We propose a novel method, termed Progressive Cross-primitive Compatibility (ProCC), to mimic the human learning process for OW-CZSL tasks.
arXiv Detail & Related papers (2022-11-19T10:09:46Z) - Global-and-Local Collaborative Learning for Co-Salient Object Detection [162.62642867056385]
The goal of co-salient object detection (CoSOD) is to discover salient objects that commonly appear in a query group containing two or more relevant images.
We propose a global-and-local collaborative learning architecture, which includes a global correspondence modeling (GCM) module and a local correspondence modeling (LCM) module.
The proposed GLNet is evaluated on three prevailing CoSOD benchmark datasets, demonstrating that our model trained on a small dataset (about 3k images) still outperforms eleven state-of-the-art competitors trained on much larger datasets (about 8k-200k images).
arXiv Detail & Related papers (2022-04-19T14:32:41Z) - DCANet: Dense Context-Aware Network for Semantic Segmentation [4.960604671885823]
We propose a novel Dense Context-Aware (DCA) module to adaptively integrate local detail information with global dependencies.
Driven by the contextual relationship, the DCA module can better achieve the aggregation of context information to generate more powerful features.
We empirically demonstrate the promising performance of our approach with extensive experiments on three challenging datasets.
arXiv Detail & Related papers (2021-04-06T14:12:22Z) - CoADNet: Collaborative Aggregation-and-Distribution Networks for
Co-Salient Object Detection [91.91911418421086]
Co-Salient Object Detection (CoSOD) aims at discovering salient objects that repeatedly appear in a given query group containing two or more relevant images.
One challenging issue is how to effectively capture co-saliency cues by modeling and exploiting inter-image relationships.
We present an end-to-end collaborative aggregation-and-distribution network (CoADNet) to capture both salient and repetitive visual patterns from multiple images.
arXiv Detail & Related papers (2020-11-10T04:28:11Z)