MOBIUS: Towards the Next Generation of Query-Ad Matching in Baidu's Sponsored Search
- URL: http://arxiv.org/abs/2409.03449v1
- Date: Thu, 05 Sep 2024 11:56:40 GMT
- Title: MOBIUS: Towards the Next Generation of Query-Ad Matching in Baidu's Sponsored Search
- Authors: Miao Fan, Jiacheng Guo, Shuai Zhu, Shuo Miao, Mingming Sun, Ping Li
- Abstract summary: The Mobius project aims to train the matching layer to consider CPM as an additional optimization objective besides query-ad relevance.
This paper elaborates on how we adopt active learning to overcome the insufficiency of click history at the matching layer.
We contribute the solutions to Mobius-V1 as the first version of our next-generation query-ad matching system.
- Score: 27.752810150893552
- Abstract: Baidu runs the largest commercial web search engine in China, serving hundreds of millions of online users every day in response to a great variety of queries. To build a high-efficiency sponsored search engine, we used to adopt a three-layer funnel-shaped structure to screen and sort hundreds of ads from billions of ad candidates, subject to the requirement of low response latency and the constraints of computing resources. Given a user query, the top matching layer is responsible for providing semantically relevant ad candidates to the next layer, while the ranking layer at the bottom is more concerned with the business indicators (e.g., CPM, ROI) of those ads. The clear separation between the matching and ranking objectives results in a lower commercial return. The Mobius project was established to address this serious issue. It is our first attempt to train the matching layer to consider CPM as an additional optimization objective besides query-ad relevance, by directly predicting CTR (click-through rate) from billions of query-ad pairs. Specifically, this paper elaborates on how we adopt active learning to overcome the insufficiency of click history at the matching layer when training our neural click networks offline, and how we use state-of-the-art approximate nearest neighbor (ANN) search techniques to retrieve ads more efficiently. We contribute the solutions to Mobius-V1 as the first version of our next-generation query-ad matching system.
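The idea of matching on CPM in addition to relevance can be illustrated with a minimal sketch. It is not the Mobius implementation: the `AdCandidate` fields, the `relevance_threshold` value, and the hard-coded scores are all hypothetical, and the sketch assumes a pay-per-click setting in which expected CPM is predicted CTR times bid times 1000. A real system would obtain the CTR from a trained click model and retrieve candidates with ANN search rather than scanning a list.

```python
from dataclasses import dataclass


@dataclass
class AdCandidate:
    ad_id: str
    relevance: float      # hypothetical query-ad relevance score in [0, 1]
    predicted_ctr: float  # hypothetical predicted click-through rate in [0, 1]
    bid: float            # hypothetical advertiser bid per click


def expected_cpm(ad: AdCandidate) -> float:
    # Expected revenue per thousand impressions under pay-per-click pricing.
    return ad.predicted_ctr * ad.bid * 1000.0


def match_ads(candidates, relevance_threshold=0.5, top_k=2):
    # Keep only ads that are relevant enough, then prefer higher expected CPM.
    relevant = [ad for ad in candidates if ad.relevance >= relevance_threshold]
    return sorted(relevant, key=expected_cpm, reverse=True)[:top_k]


candidates = [
    AdCandidate("a", relevance=0.9, predicted_ctr=0.02, bid=1.0),
    AdCandidate("b", relevance=0.8, predicted_ctr=0.05, bid=0.5),
    AdCandidate("c", relevance=0.2, predicted_ctr=0.10, bid=2.0),  # high CPM, low relevance
    AdCandidate("d", relevance=0.7, predicted_ctr=0.01, bid=1.5),
]

selected = match_ads(candidates)
selected_ids = [ad.ad_id for ad in selected]
print(selected_ids)
```

In this toy data, ad "c" has the highest expected CPM but is dropped by the relevance filter, which is the kind of trade-off a matching layer with a joint objective has to manage.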
Related papers
- MM-Embed: Universal Multimodal Retrieval with Multimodal LLMs [78.5013630951288]
This paper introduces techniques for advancing information retrieval with multimodal large language models (MLLMs).
We first study fine-tuning an MLLM as a bi-encoder retriever on 10 datasets with 16 retrieval tasks.
We propose modality-aware hard negative mining to mitigate the modality bias exhibited by MLLM retrievers.
arXiv Detail & Related papers (2024-11-04T20:06:34Z)
- Tree Search for Language Model Agents [69.43007235771383]
We propose an inference-time search algorithm for LM agents to perform exploration and multi-step planning in interactive web environments.
Our approach is a form of best-first tree search that operates within the actual environment space.
It is the first tree search algorithm for LM agents that shows effectiveness on realistic web tasks.
arXiv Detail & Related papers (2024-07-01T17:07:55Z)
- ACE: A Generative Cross-Modal Retrieval Framework with Coarse-To-Fine Semantic Modeling [53.97609687516371]
We propose a pioneering generAtive Cross-modal rEtrieval framework (ACE) for end-to-end cross-modal retrieval.
ACE achieves state-of-the-art performance in cross-modal retrieval and outperforms the strong baselines on Recall@1 by 15.27% on average.
arXiv Detail & Related papers (2024-06-25T12:47:04Z)
- Scaling Up LLM Reviews for Google Ads Content Moderation [22.43127685744644]
Large language models (LLMs) are powerful tools for content moderation, but their inference costs and latency make them prohibitive for casual use on large datasets.
This study proposes a method for scaling up LLM reviews for content in Google Ads.
arXiv Detail & Related papers (2024-02-07T23:47:02Z)
- Autonomous Tree-search Ability of Large Language Models [58.68735916408101]
Large Language Models have demonstrated remarkable reasoning capabilities with advanced prompting techniques.
Recent works propose to utilize external programs to define search logic, such that LLMs can perform passive tree search to solve more challenging reasoning tasks.
We propose a new concept called autonomous tree-search ability of LLM, which can automatically generate a response containing search trajectories for the correct answer.
arXiv Detail & Related papers (2023-10-14T14:14:38Z)
- UniKGQA: Unified Retrieval and Reasoning for Solving Multi-hop Question Answering Over Knowledge Graph [89.98762327725112]
Multi-hop Question Answering over Knowledge Graph (KGQA) aims to find the answer entities that are multiple hops away from the topic entities mentioned in a natural language question.
We propose UniKGQA, a novel approach for the multi-hop KGQA task, by unifying retrieval and reasoning in both model architecture and parameter learning.
arXiv Detail & Related papers (2022-12-02T04:08:09Z)
- Tree-based Text-Vision BERT for Video Search in Baidu Video Advertising [58.09698019028931]
Pairing video ads with user searches is the core task of Baidu video advertising.
Due to the modality gap, query-to-video retrieval is much more challenging than traditional query-to-document retrieval.
We present a tree-based combo-attention network (TCAN) which has been recently launched in Baidu's dynamic video advertising platform.
arXiv Detail & Related papers (2022-09-19T04:49:51Z)
- Online Bidding Algorithms for Return-on-Spend Constrained Advertisers [10.500109788348732]
This work explores efficient online algorithms for a single value-maximizing advertiser under an increasingly popular constraint: Return-on-Spend (RoS).
We contribute a simple online algorithm that achieves near-optimal regret in expectation while always respecting the specified RoS constraint.
arXiv Detail & Related papers (2022-08-29T16:49:24Z)
- Diversity driven Query Rewriting in Search Advertising [1.5289756643078838]
Generative retrieval models have been shown to be effective at the task of generating such query rewrites.
We introduce CLOVER, a framework to generate both high-quality and diverse rewrites.
We empirically show the effectiveness of our proposed approach through offline experiments on search queries across geographies spanning three major languages.
arXiv Detail & Related papers (2021-06-07T17:30:45Z)
- Optimizing AD Pruning of Sponsored Search with Reinforcement Learning [14.583308909225552]
An industrial sponsored search system (SSS) can be logically divided into three modules: keywords matching, ad retrieving, and ranking.
The problem we are going to address is: how to pick out the best $K$ items from $N$ candidates to maximize the system's revenue.
We propose a novel model-free reinforcement learning approach to solving this problem.
arXiv Detail & Related papers (2020-08-05T09:19:10Z)