Related papers: Design and Evaluation of Whole-Page Experience Optimization for E-commerce Search

URL: http://arxiv.org/abs/2602.02514v1
Date: Fri, 23 Jan 2026 07:41:04 GMT
Title: Design and Evaluation of Whole-Page Experience Optimization for E-commerce Search
Authors: Pratik Lahiri, Bingqing Ge, Zhou Qin, Aditya Jumde, Shuning Huo, Lucas Scottini, Yi Liu, Mahmoud Mamlouk, Wenyang Liu,
Abstract summary: E-commerce search results pages (SRPs) are evolving from linear lists to complex, non-linear layouts. We propose a novel Whole-Page Experience Optimization Framework to bridge the gap between short-term signals and long-term satisfaction metrics. We validate our approach through industry-scale A/B testing, where the model demonstrated a 1.86% improvement in brand relevance.
Score: 4.089644567431606
License: http://creativecommons.org/licenses/by/4.0/
Abstract: E-commerce Search Results Pages (SRPs) are evolving from linear lists to complex, non-linear layouts, rendering traditional position-biased ranking models insufficient. Moreover, existing optimization frameworks typically maximize short-term signals (e.g., clicks, same-day revenue) because long-term satisfaction metrics (e.g., expected two-week revenue) involve delayed feedback and challenging long-horizon credit attribution. To bridge these gaps, we propose a novel Whole-Page Experience Optimization Framework. Unlike traditional list-wise rankers, our approach explicitly models the interplay between item relevance, 2D positional layout, and visual elements. We use a causal framework to develop metrics for measuring long-term user satisfaction based on quasi-experimental data. We validate our approach through industry-scale A/B testing, where the model demonstrated a 1.86% improvement in brand relevance (our primary customer experience metric) while simultaneously achieving a statistically significant revenue uplift of +0.05%.

Related papers

Leveraging Generative Models for Real-Time Query-Driven Text Summarization in Large-Scale Web Search [54.987957691350665]
Query-Driven Text Summarization (QDTS) aims to generate concise and informative summaries from textual documents based on a given query. Traditional extractive summarization models, based primarily on ranking candidate summary segments, have been the dominant approach in industrial applications. We propose a novel framework to pioneer the application of generative models to address real-time QDTS in industrial web search.
arXiv Detail & Related papers (2025-08-28T08:51:51Z)
Generating Query-Relevant Document Summaries via Reinforcement Learning [5.651096645934245]
ReLSum is a reinforcement learning framework designed to generate query-relevant summaries of product descriptions optimized for search relevance. The framework employs a trainable large language model (LLM) to produce summaries, which are then used as input for a cross-encoder ranking model. Experimental results demonstrate significant improvements in offline metrics, including recall and NDCG, as well as online user engagement metrics.
arXiv Detail & Related papers (2025-08-11T18:52:28Z)
NAM: A Normalization Attention Model for Personalized Product Search In Fliggy [14.447458070745231]
We propose a Normalization Attention Model (NAM) for personalized product search. We show that our proposed NAM model significantly outperforms state-of-the-art baseline models.
arXiv Detail & Related papers (2025-06-10T02:46:05Z)
Optimizing Recall or Relevance? A Multi-Task Multi-Head Approach for Item-to-Item Retrieval in Recommendation [23.61568268070558]
We propose a Multi-Task and Multi-Head (MTMH) I2I retrieval model that achieves both high recall and semantic relevance. We evaluate MTMH using proprietary data from a commercial platform serving billions of users and demonstrate that it can improve recall by up to 14.4% and semantic relevance by up to 56.6%.
arXiv Detail & Related papers (2025-06-06T17:00:20Z)
On the Role of Feedback in Test-Time Scaling of Agentic AI Workflows [71.92083784393418]
Agentic AI systems (those that autonomously plan and act) are becoming widespread, yet their success rate on complex tasks remains low. Inference-time alignment relies on three components: sampling, evaluation, and feedback. We introduce Iterative Agent Decoding (IAD), a procedure that repeatedly inserts feedback extracted from different forms of critiques.
arXiv Detail & Related papers (2025-04-02T17:40:47Z)
LREF: A Novel LLM-based Relevance Framework for E-commerce [14.217396055372053]
This paper proposes a novel framework called the LLM-based RElevance Framework (LREF) aimed at enhancing e-commerce search relevance. We evaluate the performance of the framework through a series of offline experiments on large-scale real-world datasets, as well as online A/B testing. The model was deployed in a well-known e-commerce application, yielding substantial commercial benefits.
arXiv Detail & Related papers (2025-03-12T10:10:30Z)
Client-Centric Federated Adaptive Optimization [78.30827455292827]
Federated Learning (FL) is a distributed learning paradigm where clients collaboratively train a model while keeping their own data private. We propose Client-Centric Federated Adaptive Optimization, which is a class of novel federated optimization approaches.
arXiv Detail & Related papers (2025-01-17T04:00:50Z)
Margin Matching Preference Optimization: Enhanced Model Alignment with Granular Feedback [64.67540769692074]
Large language models (LLMs) fine-tuned with alignment techniques, such as reinforcement learning from human feedback, have been instrumental in developing some of the most capable AI systems to date. We introduce an approach called Margin Matching Preference Optimization (MMPO), which incorporates relative quality margins into optimization, leading to improved LLM policies and reward models. Experiments with both human and AI feedback data demonstrate that MMPO consistently outperforms baseline methods, often by a substantial margin, on popular benchmarks including MT-bench and RewardBench.
arXiv Detail & Related papers (2024-10-04T04:56:11Z)
Generative Pre-trained Ranking Model with Over-parameterization at Web-Scale (Extended Abstract) [73.57710917145212]
Learning to rank is widely employed in web searches to prioritize pertinent webpages based on input queries. We propose a Generative Semi-Supervised Pre-trained (GS2P) model to address these challenges. We conduct extensive offline experiments on both a publicly available dataset and a real-world dataset collected from a large-scale search engine.
arXiv Detail & Related papers (2024-09-25T03:39:14Z)
Optimizing E-commerce Search: Toward a Generalizable and Rank-Consistent Pre-Ranking Model [13.573766789458118]
In large e-commerce platforms, the pre-ranking phase is crucial for filtering out the bulk of products in advance for the downstream ranking module. We propose a novel method: a Generalizable and RAnk-ConsistEnt Pre-Ranking Model (GRACE), which achieves: 1) ranking consistency, by introducing multiple binary classification tasks that predict whether a product is within the top-k results as estimated by the ranking model, which facilitates the addition of learning objectives on common point-wise ranking models; 2) generalizability, through contrastive learning of representation for all products by pre-training on a subset of ranking product embeddings.
arXiv Detail & Related papers (2024-05-09T07:55:52Z)
Improving Text Matching in E-Commerce Search with A Rationalizable, Intervenable and Fast Entity-Based Relevance Model [78.80174696043021]
We propose a novel model called the Entity-Based Relevance Model (EBRM). The decomposition allows us to use a Cross-encoder QE relevance module for high accuracy. We also show that pretraining the QE module with auto-generated QE data from user logs can further improve the overall performance.
arXiv Detail & Related papers (2023-07-01T15:44:53Z)
PreSizE: Predicting Size in E-Commerce using Transformers [76.33790223551074]
PreSizE is a novel deep learning framework that uses Transformers for accurate size prediction. We demonstrate that PreSizE achieves superior prediction performance compared to previous state-of-the-art baselines. As a proof of concept, we demonstrate that size predictions made by PreSizE can be effectively integrated into an existing production recommender system.
arXiv Detail & Related papers (2021-05-04T15:23:59Z)

This list is automatically generated from the titles and abstracts of the papers on this site.