Best-$k$ Search Algorithm for Neural Text Generation
- URL: http://arxiv.org/abs/2211.11924v1
- Date: Tue, 22 Nov 2022 00:26:13 GMT
- Title: Best-$k$ Search Algorithm for Neural Text Generation
- Authors: Jiacheng Xu, Caiming Xiong, Silvio Savarese, Yingbo Zhou
- Abstract summary: We propose a deterministic search algorithm balancing both quality and diversity.
The proposed algorithm is parameter-free, lightweight, efficient, and easy to use.
- Score: 118.02691398555781
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modern natural language generation paradigms require a good decoding strategy
to obtain quality sequences out of the model. Beam search yields high-quality
but low diversity outputs; stochastic approaches suffer from high variance and
sometimes low quality, but the outputs tend to be more natural and creative. In
this work, we propose a deterministic search algorithm balancing both quality
and diversity. We first investigate the vanilla best-first search (BFS)
algorithm and then propose the Best-$k$ Search algorithm. Inspired by BFS, we
greedily expand the top $k$ nodes, instead of only the first node, to boost
efficiency and diversity. Upweighting recently discovered nodes accompanied by
heap pruning ensures the completeness of the search procedure. Experiments on
four NLG tasks, including question generation, commonsense generation, text
summarization, and translation, show that best-$k$ search yields more diverse
and natural outputs compared to strong baselines, while our approach maintains
high text quality. The proposed algorithm is parameter-free, lightweight,
efficient, and easy to use.
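The mechanics named in the abstract (popping the top $k$ frontier nodes per step, upweighting recently discovered nodes, and pruning the heap) map naturally onto a priority-queue loop. Below is a minimal sketch of that loop, assuming a hypothetical `expand_fn(node)` interface that returns scored children and a `node.is_finished()` predicate; the `recency_weight` and `heap_cap` values are illustrative, not the paper's exact settings.

```python
import heapq

def best_k_search(expand_fn, root, k=5, max_steps=50,
                  recency_weight=0.1, heap_cap=1000):
    """Sketch of a best-k search loop: like best-first search, but the
    top k frontier nodes are popped and expanded at every step.

    expand_fn(node) -> list of (child, logprob_increment); this
    interface is an assumption for illustration, not the paper's API.
    """
    step = 0
    # Min-heap over (-priority, tiebreak, node, score, birth_step).
    frontier = [(-0.0, 0, root, 0.0, 0)]
    tiebreak = 1
    completed = []

    while frontier and step < max_steps:
        step += 1
        # Pop the k best nodes instead of only the single best one.
        batch = [heapq.heappop(frontier)
                 for _ in range(min(k, len(frontier)))]
        for _, _, node, score, _ in batch:
            for child, inc in expand_fn(node):
                child_score = score + inc
                if child.is_finished():  # assumed completion predicate
                    completed.append((child_score, child))
                    continue
                # Upweight recently discovered nodes so the search keeps
                # moving forward instead of stalling on old shallow nodes.
                priority = child_score + recency_weight * step
                heapq.heappush(
                    frontier,
                    (-priority, tiebreak, child, child_score, step))
                tiebreak += 1
        # Heap pruning: keep only the best entries to bound memory.
        if len(frontier) > heap_cap:
            frontier = heapq.nsmallest(heap_cap, frontier)
            heapq.heapify(frontier)
    return sorted(completed, key=lambda t: t[0], reverse=True)
```

The recency term keeps the search from stalling on shallow, high-scoring prefixes, while the cap bounds memory; the paper's exact weighting and pruning rules may differ from this sketch.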
Related papers
- A Three-Stage Algorithm for the Closest String Problem on Artificial and Real Gene Sequences [39.58317527488534]
The Closest String Problem is an NP-hard problem that seeks a string with minimum distance to every string in a given set.
In this paper, we introduce a three-stage algorithm: first, we apply a novel alphabet pruning method to reduce the search space and effectively find promising search regions.
Second, a variant of beam search is employed to find a solution; this method utilizes a newly developed guiding function based on an expected distance score of partial solutions.
arXiv Detail & Related papers (2024-07-17T21:26:27Z)
- LiteSearch: Efficacious Tree Search for LLM [70.29796112457662]
This study introduces a novel guided tree search algorithm with dynamic node selection and node-level exploration budget.
Experiments conducted on the GSM8K and TabMWP datasets demonstrate that our approach enjoys significantly lower computational costs compared to baseline methods.
arXiv Detail & Related papers (2024-06-29T05:14:04Z)
- Massive-scale Decoding for Text Generation using Lattices [34.2658286826597]
We present a search algorithm to construct lattices encoding a massive number of generation options.
We show that our algorithm encodes hundreds to thousands of diverse options that remain grammatical and high-quality into one linear-sized lattice.
arXiv Detail & Related papers (2021-12-14T18:56:11Z)
- DISCO: efficient unsupervised decoding for discrete natural language problems via convex relaxation [1.370633147306388]
We study test-time decoding, a ubiquitous step in almost all sequential text generation tasks spanning a wide array of natural language processing (NLP) problems.
Our main contribution is to develop a continuous relaxation framework for the NP-hard decoding problem and to propose DISCO, an efficient algorithm based on standard first-order gradient methods.
arXiv Detail & Related papers (2021-07-07T00:40:25Z)
- Determinantal Beam Search [75.84501052642361]
Beam search is a go-to strategy for decoding neural sequence models.
In use cases that call for multiple solutions, a diverse or representative set is often desired.
By posing iterations in beam search as a series of subdeterminant problems, we can turn the algorithm into a diverse subset selection process (see the sketch below).
arXiv Detail & Related papers (2021-06-14T13:01:46Z)
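To make the "series of subdeterminant problems" framing above concrete, here is a generic greedy sketch of determinant-maximizing subset selection over a positive semi-definite kernel L, whose diagonal encodes hypothesis quality and whose off-diagonal entries encode pairwise similarity. The kernel construction and selection rule here are assumptions for illustration, not the paper's exact algorithm.

```python
import numpy as np

def greedy_subdeterminant_select(L, k):
    """Greedily pick k items whose kernel submatrix has (approximately)
    maximal determinant. L is an n x n positive semi-definite kernel:
    diagonal = quality, off-diagonal = similarity. A generic sketch of
    determinant-based diverse subset selection, not the paper's method.
    """
    n = L.shape[0]
    selected = []
    for _ in range(k):
        best_item, best_logdet = None, -np.inf
        for i in range(n):
            if i in selected:
                continue
            idx = selected + [i]
            # Log-determinant of the candidate submatrix; larger means
            # higher quality and lower redundancy with items chosen so far.
            sign, logdet = np.linalg.slogdet(L[np.ix_(idx, idx)])
            if sign > 0 and logdet > best_logdet:
                best_item, best_logdet = i, logdet
        if best_item is None:
            break
        selected.append(best_item)
    return selected
```

In beam search terms, each decoding step would build L over the candidate expansions and keep the selected subset as the next beam, trading a little per-step cost for diversity.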
- Towards Optimally Efficient Tree Search with Deep Learning [76.64632985696237]
This paper investigates the classical integer least-squares problem, which estimates integer signals from linear models.
The problem is NP-hard and often arises in diverse applications such as signal processing, bioinformatics, communications and machine learning.
We propose a general hyper-accelerated tree search (HATS) algorithm that employs a deep neural network to estimate the optimal heuristic for the underlying simplified memory-bounded A* algorithm.
arXiv Detail & Related papers (2021-01-07T08:00:02Z)
- Quality-Diversity Optimization: a novel branch of stochastic optimization [5.677685109155078]
Multimodal optimization algorithms search for the highest peaks in the search space, of which there can be more than one.
Quality-Diversity algorithms are a recent addition to the evolutionary computation toolbox; they do not search only for a single set of local optima but instead try to illuminate the search space.
arXiv Detail & Related papers (2020-12-08T09:52:50Z)
- A Systematic Characterization of Sampling Algorithms for Open-ended Language Generation [71.31905141672529]
We study the widely adopted ancestral sampling algorithms for auto-regressive language models.
We identify three key properties that are shared among them: entropy reduction, order preservation, and slope preservation.
We find that sampling algorithms satisfying these properties perform on par with existing sampling algorithms.
arXiv Detail & Related papers (2020-09-15T17:28:42Z)
- Best-First Beam Search [78.71330480725668]
We show that the standard implementation of beam search can be made up to 10x faster in practice.
We propose a memory-reduced variant of Best-First Beam Search, which has a similar beneficial search bias in terms of downstream performance; a sketch of the priority-queue idea appears below.
arXiv Detail & Related papers (2020-07-08T05:56:01Z)
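As a rough illustration of the best-first idea referenced above: with non-positive log-probability increments, hypothesis scores only decrease along a path, so a priority queue can visit hypotheses in score order and stop at the first completed pop. This is a minimal sketch with an assumed `expand_fn`/`is_finished` interface, not the paper's reference implementation.

```python
import heapq

def best_first_beam_search(expand_fn, root, beam=5, max_len=50):
    """Sketch: beam search driven by a priority queue, so hypotheses are
    visited in score order and the loop can stop as soon as a complete
    hypothesis is popped. Assumes log-prob increments are <= 0, so scores
    never increase along a path; expand_fn is an assumed interface.
    """
    # Min-heap over (-score, tiebreak, length, node).
    heap = [(-0.0, 0, 0, root)]
    tiebreak = 1
    pops_at_len = {}  # enforce the beam: at most `beam` pops per length
    while heap:
        neg, _, length, node = heapq.heappop(heap)
        if pops_at_len.get(length, 0) >= beam:
            continue  # beam width exhausted at this time step
        pops_at_len[length] = pops_at_len.get(length, 0) + 1
        if node.is_finished() or length >= max_len:
            return -neg, node  # first completed pop scores at least
                               # as well as anything still queued
        for child, inc in expand_fn(node):  # inc <= 0 (log-prob)
            heapq.heappush(heap, (neg - inc, tiebreak, length + 1, child))
            tiebreak += 1
    return None
```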
- Trading Off Diversity and Quality in Natural Language Generation [12.672685374008259]
We cast decoding as a multi-objective optimization problem aiming to simultaneously maximize both response quality and diversity.
Our framework enables us to perform the first large-scale evaluation of decoding methods along the entire quality-diversity spectrum.
We leverage our findings to create and evaluate an algorithm called selective sampling, which tractably approximates globally-normalized temperature sampling.
arXiv Detail & Related papers (2020-04-22T09:12:10Z)