Related papers: Text-Based Product Matching -- Semi-Supervised Clustering Approach

Related papers

Personalized Product Search Ranking: A Multi-Task Learning Approach with Tabular and Non-Tabular Data [5.361964008135103]
We present a novel model architecture for optimizing personalized product search ranking using a multi-task learning framework. We propose a scalable relevance labeling mechanism based on click-through rates, click positions, and semantic similarity. Experimental results show that combining non-tabular data with advanced embedding techniques in a multi-task learning paradigm significantly enhances model performance.
arXiv Detail & Related papers (2025-08-13T09:15:08Z)
Pre-training Generative Recommender with Multi-Identifier Item Tokenization [78.87007819266957]
We propose MTGRec to augment token sequence data for Generative Recommender pre-training. Our approach involves two key innovations: multi-identifier item tokenization and curriculum recommender pre-training. Extensive experiments on three public benchmark datasets demonstrate that MTGRec significantly outperforms both traditional and generative recommendation baselines.
arXiv Detail & Related papers (2025-04-06T08:03:03Z)
Generative Retrieval and Alignment Model: A New Paradigm for E-commerce Retrieval [12.705202836685189]
This paper introduces a novel e-commerce retrieval paradigm: the Generative Retrieval and Alignment Model (GRAM). GRAM employs joint training on text information from both queries and products to generate shared text codes. GRAM significantly outperforms traditional models and the latest generative retrieval models.
arXiv Detail & Related papers (2025-04-02T06:40:09Z)
Multimodal semantic retrieval for product search [6.185573921868495]
We build a multimodal representation for product items in e-commerce search in contrast to pure-text representation of products. We demonstrate that a multimodal representation scheme for a product can show improvement on purchase recall or relevance accuracy in semantic retrieval.
arXiv Detail & Related papers (2025-01-13T14:34:26Z)
Exploring Fine-grained Retail Product Discrimination with Zero-shot Object Classification Using Vision-Language Models [50.370043676415875]
In smart retail applications, the large number of products and their frequent turnover necessitate reliable zero-shot object classification methods. We introduce the MIMEX dataset, comprising 28 distinct product categories. We benchmark the zero-shot object classification performance of state-of-the-art vision-language models (VLMs) on the proposed MIMEX dataset.
arXiv Detail & Related papers (2024-09-23T12:28:40Z)
STORE: Streamlining Semantic Tokenization and Generative Recommendation with A Single LLM [59.08493154172207]
We propose a unified framework to streamline the semantic tokenization and generative recommendation process. We formulate semantic tokenization as a text-to-token task and generative recommendation as a token-to-token task, supplemented by a token-to-text reconstruction task and a text-to-token auxiliary task. All these tasks are framed in a generative manner and trained using a single large language model (LLM) backbone.
arXiv Detail & Related papers (2024-09-11T13:49:48Z)
CART: A Generative Cross-Modal Retrieval Framework with Coarse-To-Fine Semantic Modeling [53.97609687516371]
Cross-modal retrieval aims to search for instances that are semantically related to the query through the interaction of different modal data. Traditional solutions utilize a single-tower or dual-tower framework to explicitly compute the score between queries and candidates. We propose a generative cross-modal retrieval framework (CART) based on coarse-to-fine semantic modeling.
arXiv Detail & Related papers (2024-06-25T12:47:04Z)
MMGRec: Multimodal Generative Recommendation with Transformer Model [81.61896141495144]
MMGRec aims to introduce a generative paradigm into multimodal recommendation. We first devise a hierarchical quantization method, Graph CF-RQVAE, to assign a Rec-ID to each item from its multimodal information. We then train a Transformer-based recommender to generate the Rec-IDs of user-preferred items based on historical interaction sequences.
arXiv Detail & Related papers (2024-04-25T12:11:27Z)
Enhanced E-Commerce Attribute Extraction: Innovating with Decorative Relation Correction and LLAMA 2.0-Based Annotation [4.81846973621209]
We propose a pioneering framework that integrates BERT for classification, a Conditional Random Fields (CRFs) layer for attribute value extraction, and Large Language Models (LLMs) for data annotation. Our approach capitalizes on the robust representation learning of BERT, synergized with the sequence decoding prowess of CRFs, to adeptly identify and extract attribute values. Our methodology is rigorously validated on various datasets, including Walmart, BestBuy's e-commerce NER dataset, and the CoNLL dataset.
arXiv Detail & Related papers (2023-12-09T08:26:30Z)
A Unified Generative Approach to Product Attribute-Value Identification [6.752749933406399]
We explore a generative approach to the product attribute-value identification (PAVI) task. We finetune a pre-trained generative model, T5, to decode a set of attribute-value pairs as a target sequence from the given product text. Experimental results confirm that our generation-based approach outperforms the existing extraction and classification-based methods.
arXiv Detail & Related papers (2023-06-09T00:33:30Z)
Exploiting Diversity of Unlabeled Data for Label-Efficient Semi-Supervised Active Learning [57.436224561482966]
Active learning is a research area that addresses the issues of expensive labeling by selecting the most important samples for labeling. We introduce a new diversity-based initial dataset selection algorithm to select the most informative set of samples for initial labeling in the active learning setting. Also, we propose a novel active learning query strategy, which uses diversity-based sampling on consistency-based embeddings.
arXiv Detail & Related papers (2022-07-25T16:11:55Z)
Entity-Graph Enhanced Cross-Modal Pretraining for Instance-level Product Retrieval [152.3504607706575]
This research aims to conduct weakly-supervised multi-modal instance-level product retrieval for fine-grained product categories. We first contribute the Product1M datasets and define two practical instance-level retrieval tasks. We then train a more effective cross-modal model that can adaptively incorporate key concept information from the multi-modal data.
arXiv Detail & Related papers (2022-06-17T15:40:45Z)
Interpretable Methods for Identifying Product Variants [0.2589904091148018]
We introduce a novel approach to identifying product variants. It combines both constrained clustering and tailored NLP techniques. We design the algorithm to meet certain business criteria, including meeting high accuracy requirements.
arXiv Detail & Related papers (2021-04-12T14:37:16Z)
Automatic Validation of Textual Attribute Values in E-commerce Catalog by Learning with Limited Labeled Data [61.789797281676606]
We propose a novel meta-learning latent variable approach, called MetaBridge. It can learn transferable knowledge from a subset of categories with limited labeled data. It can capture the uncertainty of never-seen categories with unlabeled data.
arXiv Detail & Related papers (2020-06-15T21:31:05Z)
A Hybrid Approach to Enhance Pure Collaborative Filtering based on Content Feature Relationship [0.17188280334580192]
We introduce a novel method to extract the implicit relationship between content features using a well-known method from the natural language processing domain, namely Word2Vec. Next, we propose a novel content-based recommendation system that employs the relationship to determine vector representations for items. Our evaluation results demonstrate that it can predict the preference a user would have for a set of items as well as pure collaborative filtering.
arXiv Detail & Related papers (2020-05-17T02:20:45Z)

This list is automatically generated from the titles and abstracts of the papers on this site.