NeuralSearchX: Serving a Multi-billion-parameter Reranker for
Multilingual Metasearch at a Low Cost
- URL: http://arxiv.org/abs/2210.14837v1
- Date: Wed, 26 Oct 2022 16:36:53 GMT
- Title: NeuralSearchX: Serving a Multi-billion-parameter Reranker for
Multilingual Metasearch at a Low Cost
- Authors: Thales Sales Almeida, Thiago Laitz, João Seródio, Luiz Henrique
Bonifacio, Roberto Lotufo, Rodrigo Nogueira
- Abstract summary: We describe NeuralSearchX, a metasearch engine based on a multi-purpose large reranking model to merge results and highlight sentences.
We show that our design choices led to a much more cost-effective system with competitive QPS and close to state-of-the-art results on a wide range of public benchmarks.
- Score: 4.186775801993103
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The widespread availability of search APIs (both free and commercial) brings
the promise of increased coverage and quality of search results for metasearch
engines, while decreasing the maintenance costs of the crawling and indexing
infrastructures. However, merging strategies frequently comprise complex
pipelines that require careful tuning, which is often overlooked in the
literature. In this work, we describe NeuralSearchX, a metasearch engine based
on a multi-purpose large reranking model to merge results and highlight
sentences. Due to the homogeneity of our architecture, we could focus our
optimization efforts on a single component. We compare our system with
Microsoft's Biomedical Search and show that our design choices led to a much
more cost-effective system with competitive QPS while achieving close to
state-of-the-art results on a wide range of public benchmarks. Human evaluation
on two domain-specific tasks shows that our retrieval system outperformed the
Google API by a large margin in terms of nDCG@10 scores. By describing our
architecture and implementation in detail, we hope that the community will
build on our design choices. The system is available at
https://neuralsearchx.nsx.ai.
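To make the design concrete, here is a minimal sketch of the paper's central idea: candidates from several search APIs are pooled and rescored by one shared reranker, and ranking quality is measured with nDCG@10. The specifics below are assumptions for illustration only: the backend names, the example documents, and the small public cross-encoder standing in for the paper's multi-billion-parameter multilingual model.

```python
import math

from sentence_transformers import CrossEncoder

# A small public cross-encoder as a stand-in for the paper's
# multi-billion-parameter multilingual reranker (an assumption,
# not the authors' model).
model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

def merge_results(query, backend_results, k=10):
    """Merge candidates from several search APIs by rescoring every
    (query, document) pair with one shared reranker.

    backend_results maps a backend name to its retrieved documents
    (plain strings here, for simplicity).
    """
    # Pool and deduplicate candidates across backends.
    candidates = list({d for docs in backend_results.values() for d in docs})
    # A single model scores all pairs, so tuning concentrates on one component.
    scores = model.predict([(query, d) for d in candidates])
    return sorted(zip(candidates, scores), key=lambda p: p[1], reverse=True)[:k]

def ndcg_at_k(relevances, k=10):
    """nDCG@k for one query: DCG of the observed ranking divided by DCG
    of the ideal ranking. relevances are graded judgments in rank order."""
    def dcg(rels):
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Hypothetical usage with two backends (real candidates would come from the APIs).
top = merge_results(
    "mRNA vaccine efficacy",
    {
        "api_a": ["mRNA vaccines showed high efficacy in phase 3 trials.",
                  "Cold-chain logistics constrain vaccine distribution."],
        "api_b": ["Viral-vector vaccines differ from mRNA designs."],
    },
)
print(ndcg_at_k([3, 2, 0, 1]))  # graded judgments for an example ranking
```

Because every backend's candidates pass through the same scoring model, optimization effort concentrates on that single component, which is the homogeneity argument the abstract makes.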
Related papers
- Pareto-aware Neural Architecture Generation for Diverse Computational
Budgets [94.27982238384847]
Existing methods often perform an independent architecture search process for each target budget.
We propose a Pareto-aware Neural Architecture Generator (PNAG), which only needs to be trained once and dynamically produces the optimal architecture for any given budget via inference.
Such a joint search algorithm not only greatly reduces the overall search cost but also improves the results.
arXiv Detail & Related papers (2022-10-14T08:30:59Z)
- Machine Translation Decoding beyond Beam Search [43.27883368285612]
Beam search is the go-to method for decoding auto-regressive machine translation models.
Our aim is to establish whether beam search can be replaced by a more powerful metric-driven search technique.
We introduce a Monte-Carlo Tree Search (MCTS) based method and showcase its competitiveness.
arXiv Detail & Related papers (2021-04-12T10:28:17Z)
- One-Shot Neural Ensemble Architecture Search by Diversity-Guided Search
Space Shrinking [97.60915598958968]
We propose a one-shot neural ensemble architecture search (NEAS) solution that addresses two key challenges: promoting diversity among the candidate models and keeping the search efficient.
For the first challenge, we introduce a novel diversity-based metric to guide search space shrinking.
For the second challenge, we enable a new search dimension to learn layer sharing among different models for efficiency purposes.
arXiv Detail & Related papers (2021-04-01T16:29:49Z)
- Towards Improving the Consistency, Efficiency, and Flexibility of
Differentiable Neural Architecture Search [84.4140192638394]
Most differentiable neural architecture search methods construct a super-net for search and derive a target-net as its sub-graph for evaluation.
In this paper, we introduce EnTranNAS, which is composed of Engine-cells and Transit-cells.
Our method also spares much memory and computation cost, which speeds up the search process.
arXiv Detail & Related papers (2021-01-27T12:16:47Z)
- DiffMG: Differentiable Meta Graph Search for Heterogeneous Graph Neural
Networks [45.075163625895286]
We search for a meta graph, which can capture more complex semantic relations than a meta path, to determine how graph neural networks propagate messages along different types of edges.
We design an expressive search space in the form of a directed acyclic graph (DAG) to represent candidate meta graphs for a heterogeneous information network (HIN).
We propose a novel and efficient search algorithm to make the total search cost on a par with training a single GNN once.
arXiv Detail & Related papers (2020-10-07T08:09:29Z)
- MS-RANAS: Multi-Scale Resource-Aware Neural Architecture Search [94.80212602202518]
We propose Multi-Scale Resource-Aware Neural Architecture Search (MS-RANAS).
We employ a one-shot architecture search approach to reduce the search cost.
We achieve state-of-the-art results in terms of accuracy-speed trade-off.
arXiv Detail & Related papers (2020-09-29T11:56:01Z)
- Fine-Grained Stochastic Architecture Search [6.277767522867666]
Fine-Grained Architecture Search (FiGS) is a differentiable search method that searches over a much larger set of candidate architectures.
FiGS simultaneously selects and modifies operators in the search space by applying a structured sparse regularization penalty.
We show results across 3 existing search spaces, matching or outperforming the original search algorithms.
arXiv Detail & Related papers (2020-06-17T01:04:14Z)
- Deep-n-Cheap: An Automated Search Framework for Low Complexity Deep
Learning [3.479254848034425]
We present Deep-n-Cheap, an open-source AutoML framework to search for deep learning models.
Our framework is targeted for deployment on both benchmark and custom datasets.
Deep-n-Cheap includes a user-customizable complexity penalty that trades off performance against training time or parameter count; a minimal sketch of such a penalized objective appears after this list.
arXiv Detail & Related papers (2020-03-27T13:00:21Z)
- AutoSTR: Efficient Backbone Search for Scene Text Recognition [80.7290173000068]
Scene text recognition (STR) is very challenging due to the diversity of text instances and the complexity of scenes.
We propose automated STR (AutoSTR) to search data-dependent backbones to boost text recognition performance.
Experiments demonstrate that, by searching data-dependent backbones, AutoSTR can outperform the state-of-the-art approaches on standard benchmarks.
arXiv Detail & Related papers (2020-03-14T06:51:04Z)
- NAS-Count: Counting-by-Density with Neural Architecture Search [74.92941571724525]
We automate the design of counting models with Neural Architecture Search (NAS).
We introduce an end-to-end searched encoder-decoder architecture, the Automatic Multi-Scale Network (AMSNet).
arXiv Detail & Related papers (2020-02-29T09:18:17Z)
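Several entries above (Deep-n-Cheap's complexity penalty, FiGS's sparsity regularizer, PNAG's budget-aware generation) share one mechanism: folding a resource cost into the search objective. The sketch below, referenced from the Deep-n-Cheap entry, is a generic illustration of that idea; the candidate list, accuracy numbers, and lam weight are invented and do not come from any of the papers.

```python
import math

# Hypothetical candidates: (name, validation accuracy, parameter count).
CANDIDATES = [
    ("small", 0.91, 2e6),
    ("medium", 0.93, 10e6),
    ("large", 0.94, 60e6),
]

def penalized_score(accuracy, num_params, lam=0.02):
    """Fold a resource cost into the search objective, in the spirit of a
    user-customizable complexity penalty. Larger lam favors smaller models."""
    return accuracy - lam * math.log10(num_params)

best = max(CANDIDATES, key=lambda c: penalized_score(c[1], c[2]))
print(best[0])  # "medium": the penalty rules out the 60M-parameter model
```

Sweeping lam traces out the accuracy-complexity trade-off, which is how a single search procedure can serve diverse computational budgets.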