Self-Consistent Decoding for More Factual Open Responses
- URL: http://arxiv.org/abs/2403.00696v1
- Date: Fri, 1 Mar 2024 17:31:09 GMT
- Title: Self-Consistent Decoding for More Factual Open Responses
- Authors: Christopher Malon and Xiaodan Zhu
- Abstract summary: "Sample & Select" improves factuality by a 30% relative margin over the DoLA, P-CRR, and S-CRR decoders.
We collect human verifications of the generated summaries, confirming the factual superiority of our method.
- Score: 28.184313177333642
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Self-consistency has emerged as a powerful method for improving the accuracy
of short answers generated by large language models. As previously defined, it
only concerns the accuracy of a final answer parsed from generated text. In
this work, we extend the idea to open response generation, by integrating
voting into the decoding method. Each output sentence is selected from among
multiple samples, conditioning on the previous selections, based on a simple
token overlap score. We compare this "Sample & Select" method to greedy
decoding, beam search, nucleus sampling, and the recently introduced
hallucination avoiding decoders of DoLA, P-CRR, and S-CRR. We show that Sample
& Select improves factuality by a 30% relative margin against these decoders in
NLI-based evaluation on the subsets of CNN/DM and XSum used in the FRANK
benchmark, while maintaining comparable ROUGE-1 F1 scores against reference
summaries. We collect human verifications of the generated summaries,
confirming the factual superiority of our method.
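The decoding procedure described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the exact overlap formula (here an F1-style token overlap) and the sampling interface `sample_next_sentence` are assumptions, since the abstract specifies only "a simple token overlap score" and sentence-level voting conditioned on previous selections.

```python
from collections import Counter

def token_overlap(a: str, b: str) -> float:
    # F1-style token overlap between two sentences. The paper says only
    # "simple token overlap score", so this exact formula is an assumption.
    ta, tb = Counter(a.lower().split()), Counter(b.lower().split())
    common = sum((ta & tb).values())
    if common == 0:
        return 0.0
    precision = common / sum(tb.values())
    recall = common / sum(ta.values())
    return 2 * precision * recall / (precision + recall)

def sample_and_select(sample_next_sentence, prompt, num_samples=4, max_sentences=5):
    # At each step, draw several candidate next sentences conditioned on the
    # sentences selected so far, then keep the candidate with the highest
    # total overlap with the other candidates (the voting step).
    selected = []
    for _ in range(max_sentences):
        context = prompt + " ".join(selected)
        candidates = [sample_next_sentence(context) for _ in range(num_samples)]
        candidates = [c for c in candidates if c]  # drop empty generations
        if not candidates:
            break
        best_i = max(range(len(candidates)), key=lambda i: sum(
            token_overlap(candidates[i], candidates[j])
            for j in range(len(candidates)) if j != i))
        selected.append(candidates[best_i])
    return " ".join(selected)
```

The intuition mirrors self-consistency for short answers: a sentence that many independent samples agree on is less likely to be a hallucination than an outlier.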
Related papers
- Balancing Diversity and Risk in LLM Sampling: How to Select Your Method and Parameter for Open-Ended Text Generation [60.493180081319785]
We propose a systematic way to estimate the intrinsic capacity of a truncation sampling method by considering the trade-off between diversity and risk at each decoding step.
Our work provides a comprehensive comparison between existing truncation sampling methods, as well as their recommended parameters as a guideline for users.
arXiv Detail & Related papers (2024-08-24T14:14:32Z) - EEL: Efficiently Encoding Lattices for Reranking [44.77383151122229]
We use Transformers to efficiently encode lattices of generated outputs.
We combine this approach with a new class of token-factored rerankers (TFRs)
Our results show both substantial speedup compared to naive reranking and often better performance on downstream metrics than comparable approaches.
arXiv Detail & Related papers (2023-06-01T17:45:32Z) - Attributable and Scalable Opinion Summarization [79.87892048285819]
We generate abstractive summaries by decoding frequent encodings, and extractive summaries by selecting the sentences assigned to the same frequent encodings.
Our method is attributable, because the model identifies sentences used to generate the summary as part of the summarization process.
It scales easily to many hundreds of input reviews, because aggregation is performed in the latent space rather than over long sequences of tokens.
arXiv Detail & Related papers (2023-05-19T11:30:37Z) - Classifiers are Better Experts for Controllable Text Generation [63.17266060165098]
We show that the proposed method significantly outperforms recent PPLM, GeDi, and DExperts on PPL and sentiment accuracy based on the external classifier of generated texts.
At the same time, it is easier to implement and tune, and has significantly fewer restrictions and requirements.
arXiv Detail & Related papers (2022-05-15T12:58:35Z) - A Well-Composed Text is Half Done! Composition Sampling for Diverse Conditional Generation [79.98319703471596]
We propose Composition Sampling, a simple but effective method to generate diverse outputs for conditional generation of higher quality.
It builds on recently proposed plan-based neural generation models that are trained to first create a composition of the output and then generate by conditioning on it and the input.
arXiv Detail & Related papers (2022-03-28T21:24:03Z) - An Evaluation Study of Generative Adversarial Networks for Collaborative Filtering [75.83628561622287]
This work successfully replicates the results published in the original paper and discusses the impact of certain differences between the CFGAN framework and the model used in the original evaluation.
The work further expands the experimental analysis comparing CFGAN against a selection of simple and well-known properly optimized baselines, observing that CFGAN is not consistently competitive against them despite its high computational cost.
arXiv Detail & Related papers (2022-01-05T20:53:27Z) - Show Me How To Revise: Improving Lexically Constrained Sentence Generation with XLNet [27.567493727582736]
We propose a two-step approach, "Predict and Revise", for constrained sentence generation.
During the predict step, we leverage a classifier to compute the learned prior for the candidate sentence.
During the revise step, we resort to MCMC sampling to revise the candidate sentence, performing a sampled action at a sampled position drawn from the learned prior.
Experimental results have demonstrated that our proposed model performs much better than the previous work in terms of sentence fluency and diversity.
arXiv Detail & Related papers (2021-09-13T09:21:07Z) - Convex Aggregation for Opinion Summarization [18.753472191533515]
An encoder-decoder model is trained to reconstruct single reviews and learns a latent review encoding space.
At summarization time, the unweighted average of latent review vectors is decoded into a summary.
We propose Coop, a convex vector aggregation framework for opinion summarization, that searches for better combinations of input reviews.
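The aggregation step described above can be sketched as follows. This is a hypothetical illustration of the idea, not Coop's actual search procedure: the subset enumeration and the `score_fn` interface (e.g., word overlap between a decoded summary and the inputs) are assumptions made for the sketch.

```python
import numpy as np
from itertools import combinations

def average_aggregate(latents):
    # Baseline described in the abstract: unweighted mean of review encodings.
    return np.mean(latents, axis=0)

def coop_style_search(latents, score_fn):
    # Sketch of a Coop-style search: enumerate the convex combinations induced
    # by subsets of the input reviews (equal weights within each subset) and
    # keep the aggregate that scores highest under score_fn.
    best_vec, best_score = None, float("-inf")
    n = len(latents)
    for k in range(1, n + 1):
        for subset in combinations(range(n), k):
            vec = np.mean([latents[i] for i in subset], axis=0)
            score = score_fn(vec, subset)
            if score > best_score:
                best_vec, best_score = vec, score
    return best_vec
```

Searching over combinations rather than always averaging everything lets the summarizer down-weight outlier reviews that would otherwise pull the mean vector away from the consensus opinion.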
arXiv Detail & Related papers (2021-04-03T10:52:14Z) - Self-Adversarial Learning with Comparative Discrimination for Text Generation [111.18614166615968]
We propose a novel self-adversarial learning (SAL) paradigm for improving GANs' performance in text generation.
During training, SAL rewards the generator when its currently generated sentence is found to be better than its previously generated samples.
Experiments on text generation benchmark datasets show that our proposed approach substantially improves both the quality and the diversity of the generated text.
arXiv Detail & Related papers (2020-01-31T07:50:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.