Determinantal Beam Search
- URL: http://arxiv.org/abs/2106.07400v4
- Date: Fri, 23 Jun 2023 05:52:22 GMT
- Title: Determinantal Beam Search
- Authors: Clara Meister, Martina Forster, Ryan Cotterell
- Abstract summary: Beam search is a go-to strategy for decoding neural sequence models.
In use-cases that call for multiple solutions, a diverse or representative set is often desired.
By posing iterations in beam search as a series of subdeterminant problems, we can turn the algorithm into a diverse subset selection process.
- Score: 75.84501052642361
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Beam search is a go-to strategy for decoding neural sequence models. The
algorithm can naturally be viewed as a subset optimization problem, albeit one
where the corresponding set function does not reflect interactions between
candidates. Empirically, this leads to sets often exhibiting high overlap,
e.g., strings may differ by only a single word. Yet in use-cases that call for
multiple solutions, a diverse or representative set is often desired. To
address this issue, we propose a reformulation of beam search, which we call
determinantal beam search. Determinantal beam search has a natural relationship
to determinantal point processes (DPPs), models over sets that inherently
encode intra-set interactions. By posing iterations in beam search as a series
of subdeterminant maximization problems, we can turn the algorithm into a
diverse subset selection process. In a case study, we use the string
subsequence kernel to explicitly encourage n-gram coverage in text generated
from a sequence model. We observe that our algorithm offers competitive
performance against other diverse set generation strategies in the context of
language generation, while providing a more general approach to optimizing for
diversity.
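The abstract's core idea — scoring a candidate set by the determinant of a kernel submatrix that couples per-candidate quality with pairwise similarity, and selecting greedily — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the paper uses a string subsequence kernel (weighted, gap-allowing subsequences), whereas here a simpler Jaccard overlap of word n-grams stands in for it, and the quality weighting `exp(alpha * score)` and all function names are illustrative assumptions.

```python
import math

def ngram_set(text, n=2):
    """Set of word n-grams in a string (a simplified stand-in for the
    paper's string subsequence kernel)."""
    words = text.split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def similarity(a, b, n=2):
    """Jaccard overlap of n-gram sets, in [0, 1]."""
    sa, sb = ngram_set(a, n), ngram_set(b, n)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)

def det(m):
    """Determinant of a small matrix via Gaussian elimination."""
    m = [row[:] for row in m]
    size, d = len(m), 1.0
    for i in range(size):
        p = max(range(i, size), key=lambda r: abs(m[r][i]))
        if abs(m[p][i]) < 1e-12:
            return 0.0
        if p != i:
            m[i], m[p] = m[p], m[i]
            d = -d
        d *= m[i][i]
        for r in range(i + 1, size):
            f = m[r][i] / m[i][i]
            for c in range(i, size):
                m[r][c] -= f * m[i][c]
    return d

def select_diverse(candidates, scores, k, n=2, alpha=0.5):
    """Greedily pick k candidates maximizing the determinant of the
    kernel submatrix L_S, where L_ij = q_i * S_ij * q_j with quality
    q_i = exp(alpha * score_i) and S the n-gram similarity (S_ii = 1).
    High similarity between two candidates shrinks the determinant,
    so the selection trades quality off against redundancy."""
    q = [math.exp(alpha * s) for s in scores]
    N = len(candidates)
    L = [[q[i] * (1.0 if i == j else similarity(candidates[i],
                                                candidates[j], n)) * q[j]
          for j in range(N)] for i in range(N)]
    chosen = []
    while len(chosen) < k:
        best, best_det = None, -1.0
        for i in range(N):
            if i in chosen:
                continue
            idx = chosen + [i]
            d = det([[L[a][b] for b in idx] for a in idx])
            if d > best_det:
                best, best_det = i, d
        chosen.append(best)
    return [candidates[i] for i in chosen]
```

On the toy input `["the cat sat on the mat", "the cat sat on a mat", "a dog ran in the park"]` with mildly decreasing scores, asking for two outputs skips the near-duplicate second string in favor of the dissimilar third one, even though the duplicate scores higher — the behavior the abstract describes as turning beam search into a diverse subset selection process.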
Related papers
- A Three-Stage Algorithm for the Closest String Problem on Artificial and Real Gene Sequences [39.58317527488534]
The Closest String Problem is an NP-hard problem that seeks a string with minimum distance to every string in a given set.
In this paper, we introduce a three-stage algorithm: first, we apply a novel alphabet pruning method to reduce the search space and identify promising search regions.
Second, a variant of beam search is employed to find a solution, guided by a newly developed function based on the expected distance score of partial solutions.
arXiv Detail & Related papers (2024-07-17T21:26:27Z)
- A Non-monotonic Self-terminating Language Model [62.93465126911921]
In this paper, we focus on the problem of non-terminating sequences resulting from an incomplete decoding algorithm.
We first define incomplete probable decoding algorithms, a class that includes greedy search, top-$k$ sampling, and nucleus sampling.
We then propose a non-monotonic self-terminating language model, which relaxes the constraint of monotonically increasing termination probability.
arXiv Detail & Related papers (2022-10-03T00:28:44Z)
- Massive-scale Decoding for Text Generation using Lattices [34.2658286826597]
We present a search algorithm to construct lattices encoding a massive number of generation options.
We show that our algorithm encodes hundreds to thousands of diverse options that remain grammatical and high-quality into one linear-sized lattice.
arXiv Detail & Related papers (2021-12-14T18:56:11Z)
- Towards Deterministic Diverse Subset Sampling [14.236193187116049]
In this paper, we discuss a greedy deterministic adaptation of k-DPP.
We demonstrate the usefulness of the model on an image search task.
arXiv Detail & Related papers (2021-05-28T16:05:58Z)
- Best-First Beam Search [78.71330480725668]
We show that the standard implementation of beam search can be made up to 10x faster in practice.
We propose a memory-reduced variant of Best-First Beam Search, which has a similar beneficial search bias in terms of downstream performance.
arXiv Detail & Related papers (2020-07-08T05:56:01Z)
- Consistency of a Recurrent Language Model With Respect to Incomplete Decoding [67.54760086239514]
We study the issue of a recurrent language model producing infinite-length sequences.
We propose two remedies which address inconsistency: consistent variants of top-k and nucleus sampling, and a self-terminating recurrent language model.
arXiv Detail & Related papers (2020-02-06T19:56:15Z)
- Extreme Algorithm Selection With Dyadic Feature Representation [78.13985819417974]
We propose the setting of extreme algorithm selection (XAS) where we consider fixed sets of thousands of candidate algorithms.
We assess the applicability of state-of-the-art AS techniques to the XAS setting and propose approaches leveraging a dyadic feature representation.
arXiv Detail & Related papers (2020-01-29T09:40:58Z)
- Optimal Clustering from Noisy Binary Feedback [75.17453757892152]
We study the problem of clustering a set of items from binary user feedback.
We devise an algorithm with a minimal cluster recovery error rate.
For adaptive selection, we develop an algorithm inspired by the derivation of information-theoretic lower bounds on the error.
arXiv Detail & Related papers (2019-10-14T09:18:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.