Design and Evaluation of Whole-Page Experience Optimization for E-commerce Search
- URL: http://arxiv.org/abs/2602.02514v1
- Date: Fri, 23 Jan 2026 07:41:04 GMT
- Title: Design and Evaluation of Whole-Page Experience Optimization for E-commerce Search
- Authors: Pratik Lahiri, Bingqing Ge, Zhou Qin, Aditya Jumde, Shuning Huo, Lucas Scottini, Yi Liu, Mahmoud Mamlouk, Wenyang Liu,
- Abstract summary: E-commerce search results pages (SRPs) are evolving from linear lists to complex, non-linear layouts.<n>We propose a novel Whole-Page Experience Optimization Framework to bridge the gap between short-term signals and long-term satisfaction metrics.<n>We validate our approach through industry-scale A/B testing, where the model demonstrated a 1.86% improvement in brand relevance.
- Score: 4.089644567431606
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: E-commerce Search Results Pages (SRPs) are evolving from linear lists to complex, non-linear layouts, rendering traditional position-biased ranking models insufficient. Moreover, existing optimization frameworks typically maximize short-term signals (e.g., clicks, same-day revenue) because long-term satisfaction metrics (e.g., expected two-week revenue) involve delayed feedback and challenging long-horizon credit attribution. To bridge these gaps, we propose a novel Whole-Page Experience Optimization Framework. Unlike traditional list-wise rankers, our approach explicitly models the interplay between item relevance, 2D positional layout, and visual elements. We use a causal framework to develop metrics for measuring long-term user satisfaction based on quasi-experimental data. We validate our approach through industry-scale A/B testing, where the model demonstrated a 1.86% improvement in brand relevance (our primary customer experience metric) while simultaneously achieving a statistically significant revenue uplift of +0.05%
Related papers
- Leveraging Generative Models for Real-Time Query-Driven Text Summarization in Large-Scale Web Search [54.987957691350665]
Query-Driven Text Summarization (QDTS) aims to generate concise and informative summaries from textual documents based on a given query.<n>Traditional extractive summarization models, based primarily on ranking candidate summary segments, have been the dominant approach in industrial applications.<n>We propose a novel framework to pioneer the application of generative models to address real-time QDTS in industrial web search.
arXiv Detail & Related papers (2025-08-28T08:51:51Z) - Generating Query-Relevant Document Summaries via Reinforcement Learning [5.651096645934245]
ReLSum is a reinforcement learning framework designed to generate query-relevant summaries of product descriptions optimized for search relevance.<n>The framework employs a trainable large language model (LLM) to produce summaries, which are then used as input for a cross-encoder ranking model.<n> Experimental results demonstrate significant improvements in offline metrics, including recall and NDCG, as well as online user engagement metrics.
arXiv Detail & Related papers (2025-08-11T18:52:28Z) - NAM: A Normalization Attention Model for Personalized Product Search In Fliggy [14.447458070745231]
We propose a Normalization Attention Model (NAM) for personalized product search.<n>We show that our proposed NAM model significantly outperforms state-of-the-art baseline models.
arXiv Detail & Related papers (2025-06-10T02:46:05Z) - Optimizing Recall or Relevance? A Multi-Task Multi-Head Approach for Item-to-Item Retrieval in Recommendation [23.61568268070558]
We propose a Multi-Task and Multi-Head I2I retrieval model that achieves both high recall and semantic relevance.<n>We evaluate MTMH using proprietary data from a commercial platform serving billions of users and demonstrate that it can improve recall by up to 14.4% and semantic relevance by up to 56.6%.
arXiv Detail & Related papers (2025-06-06T17:00:20Z) - On the Role of Feedback in Test-Time Scaling of Agentic AI Workflows [71.92083784393418]
Agentic AI (systems that autonomously plan and act) are becoming widespread, yet their task success rate on complex tasks remains low.<n>Inference-time alignment relies on three components: sampling, evaluation, and feedback.<n>We introduce Iterative Agent Decoding (IAD), a procedure that repeatedly inserts feedback extracted from different forms of critiques.
arXiv Detail & Related papers (2025-04-02T17:40:47Z) - LREF: A Novel LLM-based Relevance Framework for E-commerce [14.217396055372053]
This paper proposes a novel framework called the LLM-based RElevance Framework (LREF) aimed at enhancing e-commerce search relevance.<n>We evaluate the performance of the framework through a series of offline experiments on large-scale real-world datasets, as well as online A/B testing.<n>The model was deployed in a well-known e-commerce application, yielding substantial commercial benefits.
arXiv Detail & Related papers (2025-03-12T10:10:30Z) - Client-Centric Federated Adaptive Optimization [78.30827455292827]
Federated Learning (FL) is a distributed learning paradigm where clients collaboratively train a model while keeping their own data private.<n>We propose Federated-Centric Adaptive Optimization, which is a class of novel federated optimization approaches.
arXiv Detail & Related papers (2025-01-17T04:00:50Z) - Margin Matching Preference Optimization: Enhanced Model Alignment with Granular Feedback [64.67540769692074]
Large language models (LLMs) fine-tuned with alignment techniques, such as reinforcement learning from human feedback, have been instrumental in developing some of the most capable AI systems to date.<n>We introduce an approach called Margin Matching Preference Optimization (MMPO), which incorporates relative quality margins into optimization, leading to improved LLM policies and reward models.<n>Experiments with both human and AI feedback data demonstrate that MMPO consistently outperforms baseline methods, often by a substantial margin, on popular benchmarks including MT-bench and RewardBench.
arXiv Detail & Related papers (2024-10-04T04:56:11Z) - Generative Pre-trained Ranking Model with Over-parameterization at Web-Scale (Extended Abstract) [73.57710917145212]
Learning to rank is widely employed in web searches to prioritize pertinent webpages based on input queries.
We propose a emphulineGenerative ulineSemi-ulineSupervised ulinePre-trained (GS2P) model to address these challenges.
We conduct extensive offline experiments on both a publicly available dataset and a real-world dataset collected from a large-scale search engine.
arXiv Detail & Related papers (2024-09-25T03:39:14Z) - Optimizing E-commerce Search: Toward a Generalizable and Rank-Consistent Pre-Ranking Model [13.573766789458118]
In large e-commerce platforms, the pre-ranking phase is crucial for filtering out the bulk of products in advance for the downstream ranking module.
We propose a novel method: a Generalizable and RAnk-ConsistEnt Pre-Ranking Model (GRACE), which achieves: 1) Ranking consistency by introducing multiple binary classification tasks that predict whether a product is within the top-k results as estimated by the ranking model, which facilitates the addition of learning objectives on common point-wise ranking models; 2) Generalizability through contrastive learning of representation for all products by pre-training on a subset of ranking product embeddings
arXiv Detail & Related papers (2024-05-09T07:55:52Z) - Improving Text Matching in E-Commerce Search with A Rationalizable,
Intervenable and Fast Entity-Based Relevance Model [78.80174696043021]
We propose a novel model called the Entity-Based Relevance Model (EBRM)
The decomposition allows us to use a Cross-encoder QE relevance module for high accuracy.
We also show that pretraining the QE module with auto-generated QE data from user logs can further improve the overall performance.
arXiv Detail & Related papers (2023-07-01T15:44:53Z) - PreSizE: Predicting Size in E-Commerce using Transformers [76.33790223551074]
PreSizE is a novel deep learning framework which utilizes Transformers for accurate size prediction.
We demonstrate that PreSizE is capable of achieving superior prediction performance compared to previous state-of-the-art baselines.
As a proof of concept, we demonstrate that size predictions made by PreSizE can be effectively integrated into an existing production recommender system.
arXiv Detail & Related papers (2021-05-04T15:23:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.