Constructing Tree-based Index for Efficient and Effective Dense
Retrieval
- URL: http://arxiv.org/abs/2304.11943v1
- Date: Mon, 24 Apr 2023 09:25:39 GMT
- Title: Constructing Tree-based Index for Efficient and Effective Dense
Retrieval
- Authors: Haitao Li, Qingyao Ai, Jingtao Zhan, Jiaxin Mao, Yiqun Liu, Zheng Liu,
Zhao Cao
- Abstract summary: JTR stands for Joint optimization of TRee-based index and query encoding.
We design a new unified contrastive learning loss to train the tree-based index and the query encoder in an end-to-end manner.
Experimental results show that JTR achieves better retrieval performance while retaining high system efficiency.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent studies have shown that Dense Retrieval (DR) techniques can
significantly improve the performance of first-stage retrieval in IR systems.
Despite its empirical effectiveness, the application of DR is still limited. In
contrast to statistical retrieval models that rely on highly efficient inverted
index solutions, DR models build dense embeddings that are difficult to
pre-process with most existing search indexing systems. To avoid the expensive
cost of brute-force search, Approximate Nearest Neighbor (ANN) algorithms and
the corresponding indexes are widely applied to speed up the inference process
of DR models. Unfortunately, while ANN can improve the efficiency of DR models,
it usually comes at a significant cost in retrieval performance.
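To make this trade-off concrete, here is a minimal sketch (illustrative only,
not from the paper; the data, cluster count, and probe count are arbitrary
assumptions) comparing exact brute-force scoring with an IVF-style ANN search
that scans only the clusters closest to the query:

```python
# Contrast brute-force dense retrieval with an IVF-style ANN shortcut:
# documents are grouped into clusters, and only the closest clusters are
# scanned at query time. Fewer scanned clusters means faster search but can
# miss true nearest neighbours -- the efficiency/effectiveness trade-off.
import numpy as np

rng = np.random.default_rng(0)
docs = rng.standard_normal((10_000, 64)).astype(np.float32)  # document embeddings
query = rng.standard_normal(64).astype(np.float32)           # query embedding

# Brute-force: score every document (exact, but O(N) per query).
exact_top = np.argsort(docs @ query)[::-1][:10]

# IVF-style ANN: assign documents to sampled centroids (real systems run
# k-means), then score only documents in the n_probe closest clusters.
n_clusters, n_probe = 100, 5
centroids = docs[rng.choice(len(docs), n_clusters, replace=False)]
assign = np.argmax(docs @ centroids.T, axis=1)         # cluster id per document
probe = np.argsort(centroids @ query)[::-1][:n_probe]  # clusters to visit
cand = np.flatnonzero(np.isin(assign, probe))          # candidate documents
approx_top = cand[np.argsort(docs[cand] @ query)[::-1][:10]]

# Recall@10 of ANN against exact search: typically < 1.0 for small n_probe.
print("recall@10:", len(set(exact_top) & set(approx_top)) / 10)
```

With a small `n_probe` the candidate set shrinks dramatically, but recall@10
typically falls below 1.0, which is exactly the effectiveness price the
abstract refers to.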
To address this issue, we propose JTR, which stands for Joint optimization of
TRee-based index and query encoding. Specifically, we design a new unified
contrastive learning loss to train the tree-based index and the query encoder
in an end-to-end manner. A tree-based negative sampling strategy is applied so
that the tree satisfies the max-heap property, which underpins the
effectiveness of beam search. Moreover, we treat cluster assignment as an
optimization problem when updating the tree-based index, which allows
overlapping clusters. We evaluate JTR on numerous popular retrieval benchmarks.
Experimental results show that JTR achieves better retrieval performance while
retaining high system efficiency compared with widely adopted baselines. It
offers a potential solution for balancing efficiency and effectiveness in
neural retrieval system designs.
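As a hypothetical illustration of the retrieval stage, the sketch below runs a
level-wise beam search over a toy tree index; the tree layout, scoring
function, and all names are assumptions for exposition, not the authors'
implementation:

```python
# Beam search over a toy tree-based index. The max-heap property mentioned in
# the abstract means a parent's score should upper-bound its children's, so
# keeping only the top-b nodes per level rarely prunes away the best leaves.
from itertools import count

import numpy as np

rng = np.random.default_rng(0)
DIM, FANOUT, DEPTH = 64, 4, 3
doc_ids = count()  # toy document-id generator

class Node:
    def __init__(self, emb, children=None, docs=None):
        self.emb = emb                  # cluster (or document) embedding
        self.children = children or []  # empty for leaves
        self.docs = docs or []          # document ids stored at a leaf

def build_toy_tree(level=0):
    """Random tree: leaves hold one document each; inner nodes use the mean
    of their children's embeddings as a simple cluster representative."""
    if level == DEPTH:
        return Node(rng.standard_normal(DIM), docs=[next(doc_ids)])
    kids = [build_toy_tree(level + 1) for _ in range(FANOUT)]
    return Node(np.mean([k.emb for k in kids], axis=0), children=kids)

def beam_search(root, query, beam=2):
    """Descend level by level, keeping the `beam` highest-scoring nodes."""
    frontier = [root]
    while frontier[0].children:  # all frontier nodes share a level
        children = [c for n in frontier for c in n.children]
        scores = np.array([query @ c.emb for c in children])
        frontier = [children[i] for i in np.argsort(scores)[::-1][:beam]]
    return [d for leaf in frontier for d in leaf.docs]

tree = build_toy_tree()
print(beam_search(tree, rng.standard_normal(DIM)))  # candidate document ids
```

In JTR, the node embeddings traversed here are trained jointly with the query
encoder through the unified contrastive loss, with negatives sampled from the
tree itself, so that the max-heap assumption underlying this pruning actually
holds in practice.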
Related papers
- Learning Deep Tree-based Retriever for Efficient Recommendation: Theory and Method [76.31185707649227] (2024-08-21)
We propose a Deep Tree-based Retriever (DTR) for efficient recommendation.
DTR frames the training task as a softmax-based multi-class classification over tree nodes at the same level.
To mitigate the suboptimality induced by the labeling of non-leaf nodes, we propose a rectification method for the loss function.
- LiteSearch: Efficacious Tree Search for LLM [70.29796112457662] (2024-06-29)
This study introduces a novel guided tree search algorithm with dynamic node selection and node-level exploration budget.
Experiments conducted on the GSM8K and TabMWP datasets demonstrate that our approach enjoys significantly lower computational costs compared to baseline methods.
- DeeperImpact: Optimizing Sparse Learned Index Structures [4.92919246305126] (2024-05-27)
We focus on narrowing the effectiveness gap with the most effective versions of SPLADE, and our results substantially narrow that gap.
- Efficient Architecture Search via Bi-level Data Pruning [70.29970746807882] (2023-12-21)
This work pioneers an exploration into the critical role of dataset characteristics for DARTS bi-level optimization.
We introduce a new progressive data pruning strategy that utilizes supernet prediction dynamics as the metric.
Comprehensive evaluations on the NAS-Bench-201 search space, DARTS search space, and MobileNet-like search space validate that BDP reduces search costs by over 50%.
- Efficient Non-Parametric Optimizer Search for Diverse Tasks [93.64739408827604] (2022-09-27)
We present the first efficient, scalable, and general framework that can directly search on the tasks of interest.
Inspired by the innate tree structure of the underlying math expressions, we re-arrange the spaces into a super-tree.
We adopt an adaptation of the Monte Carlo method to tree search, equipped with rejection sampling and equivalent-form detection.
- Cascaded Fast and Slow Models for Efficient Semantic Code Search [46.53530668938728] (2021-10-15)
We propose an efficient and accurate semantic code search framework with cascaded fast and slow models.
The proposed cascaded approach is not only efficient and scalable, but also achieves state-of-the-art results.
- Learning to Hash Robustly, with Guarantees [79.68057056103014] (2021-08-11)
In this paper, we design an NNS algorithm for the Hamming space that has worst-case guarantees essentially matching that of theoretical algorithms.
We evaluate the algorithm's ability to optimize for a given dataset both theoretically and practically.
Our algorithm achieves 1.8x and 2.1x better recall on the worst-performing queries of the MNIST and ImageNet datasets, respectively.
- CARMI: A Cache-Aware Learned Index with a Cost-based Construction Algorithm [1.9798034349981157] (2021-03-01)
We propose a cache-aware learned index (CARMI) design to improve the efficiency of the Recursive Model Index (RMI) framework.
We formulate the problem of finding the optimal design of a learned index as an optimization problem and propose a dynamic programming algorithm for solving it.
Experiments show that our index construction strategy can construct indexes with significantly better performance compared to baselines.
- Off-Policy Reinforcement Learning for Efficient and Effective GAN Architecture Search [50.40004966087121] (2020-07-17)
We introduce a new reinforcement learning based neural architecture search (NAS) methodology for generative adversarial network (GAN) architecture search.
The key idea is to formulate the GAN architecture search problem as a Markov decision process (MDP) for smoother architecture sampling.
We exploit an off-policy GAN architecture search algorithm that makes efficient use of the samples generated by previous policies.