DSPy Assertions: Computational Constraints for Self-Refining Language
Model Pipelines
- URL: http://arxiv.org/abs/2312.13382v2
- Date: Fri, 2 Feb 2024 18:20:03 GMT
- Title: DSPy Assertions: Computational Constraints for Self-Refining Language
Model Pipelines
- Authors: Arnav Singhvi, Manish Shetty, Shangyin Tan, Christopher Potts, Koushik
Sen, Matei Zaharia, Omar Khattab
- Abstract summary: Chaining language model (LM) calls as composable modules is fueling a new way of programming.
We introduce LM Assertions, a construct for expressing computational constraints that LMs should satisfy.
We present new strategies that allow DSPy to compile programs with LM Assertions into more reliable and accurate systems.
- Score: 41.779902953557425
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Chaining language model (LM) calls as composable modules is fueling a new way
of programming, but ensuring LMs adhere to important constraints requires
heuristic "prompt engineering". We introduce LM Assertions, a programming
construct for expressing computational constraints that LMs should satisfy. We
integrate our constructs into the recent DSPy programming model for LMs, and
present new strategies that allow DSPy to compile programs with LM Assertions
into more reliable and accurate systems. We also propose strategies to use
assertions at inference time for automatic self-refinement with LMs. We report
on four diverse case studies for text generation and find that LM Assertions
improve not only compliance with imposed rules but also downstream task
performance, passing constraints up to 164% more often and generating up to 37%
more higher-quality responses. Our reference implementation of LM Assertions is
integrated into DSPy at https://github.com/stanfordnlp/dspy
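In code, the construct looks roughly like the sketch below: inside a DSPy module, `dspy.Assert` expresses a hard constraint (a failure triggers backtracking and a retry with the error message injected into the prompt, and persistent failure raises), while `dspy.Suggest` expresses a soft constraint used for self-refinement without halting the program. The signature, module, constraints, and configuration here are illustrative assumptions rather than an example taken from the paper, and helper names such as `activate_assertions` may differ across DSPy versions.

```python
import dspy

# An LM must be configured before running, e.g.:
# dspy.settings.configure(lm=dspy.OpenAI(model="gpt-3.5-turbo"))

class GenerateTweet(dspy.Signature):
    """Write a tweet that answers the question."""
    question = dspy.InputField()
    tweet = dspy.OutputField()

class TweetWriter(dspy.Module):
    def __init__(self):
        super().__init__()
        self.generate = dspy.ChainOfThought(GenerateTweet)

    def forward(self, question):
        pred = self.generate(question=question)
        # Hard constraint: on failure, DSPy backtracks and retries the LM call
        # with the assertion message added to the prompt; persistent failure raises.
        dspy.Assert(len(pred.tweet) <= 280,
                    "The tweet must be at most 280 characters long.")
        # Soft constraint: a failed check is fed back for self-refinement,
        # but execution continues even if it is never satisfied.
        dspy.Suggest(not pred.tweet.lower().startswith("as an ai"),
                     "Answer directly; do not open with a disclaimer.")
        return pred

# Activating assertions wraps the module with the retry/backtracking logic
# (the exact helper name may vary across versions).
writer = TweetWriter().activate_assertions()
```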
Related papers
- Reliable, Adaptable, and Attributable Language Models with Retrieval [144.26890121729514]
Parametric language models (LMs) are trained on vast amounts of web data.
They face practical challenges such as hallucinations, difficulty in adapting to new data distributions, and a lack of verifiability.
We advocate for retrieval-augmented LMs to replace parametric LMs as the next generation of LMs.
arXiv Detail & Related papers (2024-03-05T18:22:33Z)
- Small Language Model Can Self-correct [42.76612128849389]
We introduce Intrinsic Self-Correction (ISC) in generative language models, aiming to correct the initial output of LMs in a self-triggered manner.
We conduct experiments using LMs with parameter sizes ranging from 6 billion to 13 billion on two tasks: commonsense reasoning and factual knowledge reasoning.
arXiv Detail & Related papers (2024-01-14T14:29:07Z)
- LM-Polygraph: Uncertainty Estimation for Language Models [71.21409522341482]
Uncertainty estimation (UE) methods are one path to safer, more responsible, and more effective use of large language models (LLMs).
We introduce LM-Polygraph, a framework with implementations of a battery of state-of-the-art UE methods for LLMs in text generation tasks, with unified program interfaces in Python.
It introduces an extendable benchmark for consistent evaluation of UE techniques by researchers, and a demo web application that enriches the standard chat dialog with confidence scores.
arXiv Detail & Related papers (2023-11-13T15:08:59Z)
- DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines [44.772892598128784]
We introduce DSPy, a programming model that abstracts LM pipelines as text transformation graphs.
Within minutes of compiling, a few lines of DSPy allow GPT-3.5 and llama2-13b-chat to self-bootstrap pipelines.
arXiv Detail & Related papers (2023-10-05T17:37:25Z)
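As a rough illustration of what "compiling" such a declarative pipeline means, the sketch below writes a retrieve-then-answer program as DSPy modules over signatures and lets an optimizer (teleprompter) bootstrap few-shot demonstrations for each LM call from a tiny trainset and a metric. The dataset, metric, retrieval setup, and optimizer choice are placeholders, not the paper's experimental setup.

```python
import dspy
from dspy.teleprompt import BootstrapFewShot

# An LM and a retrieval model (RM) must be configured via dspy.settings first.

class RAG(dspy.Module):
    """Declarative pipeline: retrieve passages, then answer over them."""
    def __init__(self, k=3):
        super().__init__()
        self.retrieve = dspy.Retrieve(k=k)
        self.answer = dspy.ChainOfThought("context, question -> answer")

    def forward(self, question):
        context = self.retrieve(question).passages
        return self.answer(context=context, question=question)

# A toy metric and trainset; "compiling" bootstraps demonstrations that make
# each LM call in the pipeline more reliable.
def exact_match(example, pred, trace=None):
    return example.answer.lower() == pred.answer.lower()

trainset = [dspy.Example(question="Who wrote 'The Selfish Gene'?",
                         answer="Richard Dawkins").with_inputs("question")]

compiled_rag = BootstrapFewShot(metric=exact_match).compile(RAG(), trainset=trainset)
```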
- Simultaneous Machine Translation with Large Language Models [51.470478122113356]
We investigate the possibility of applying Large Language Models to SimulMT tasks.
We conducted experiments using the Llama2-7b-chat model on nine different languages from the MuST-C dataset.
The results show that the LLM outperforms dedicated MT models in terms of BLEU and LAAL metrics.
arXiv Detail & Related papers (2023-09-13T04:06:47Z)
- SatLM: Satisfiability-Aided Language Models Using Declarative Prompting [68.40726892904286]
We propose a new satisfiability-aided language modeling (SatLM) approach for improving the reasoning capabilities of large language models (LLMs).
We use an LLM to generate a declarative task specification rather than an imperative program and leverage an off-the-shelf automated theorem prover to derive the final answer.
We evaluate SatLM on 8 different datasets and show that it consistently outperforms program-aided LMs in the imperative paradigm.
arXiv Detail & Related papers (2023-05-16T17:55:51Z)
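To make the declarative-prompting idea concrete, here is a toy sketch of the workflow described above: the LLM emits a declarative specification of a problem instead of computing the answer, and an off-the-shelf solver derives the result. The word problem, the hand-written constraints standing in for LLM output, and the use of the Z3 solver are illustrative assumptions, not the paper's benchmarks or tooling.

```python
# pip install z3-solver
from z3 import Int, Solver, sat

# Suppose the LLM translated "Alice is 3 years older than Bob, and together
# they are 31" into these declarative constraints (in SatLM this specification
# would be generated by the model, not written by hand).
alice, bob = Int("alice"), Int("bob")
solver = Solver()
solver.add(alice == bob + 3, alice + bob == 31)

# The solver, not the LLM, derives the final answer.
if solver.check() == sat:
    model = solver.model()
    print("alice =", model[alice], ", bob =", model[bob])  # alice = 17, bob = 14
```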
- Demonstrate-Search-Predict: Composing retrieval and language models for knowledge-intensive NLP [77.817293104436]
We propose a framework that relies on passing natural language texts in sophisticated pipelines between an LM and a retrieval model (RM).
We have written novel DSP programs for answering questions in open-domain, multi-hop, and conversational settings.
arXiv Detail & Related papers (2022-12-28T18:52:44Z)
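Schematically, the demonstrate-search-predict pattern interleaves the LM and the RM: the LM decomposes the question into search queries, the RM retrieves passages, and the LM predicts an answer grounded in the retrieved text and a few demonstrations. The function below is a generic sketch of that control flow under those assumptions, not the DSP library's actual API.

```python
from typing import Callable, List

def demonstrate_search_predict(question: str,
                               lm: Callable[[str], str],
                               rm: Callable[[str, int], List[str]],
                               demos: List[str],
                               k: int = 3) -> str:
    # Search: ask the LM for sub-queries, then retrieve passages for each.
    queries = lm("Write search queries, one per line, for: " + question).splitlines()
    passages = [p for q in queries for p in rm(q, k)]
    # Predict: answer conditioned on demonstrations and retrieved passages.
    prompt = "\n".join(demos + passages + [f"Question: {question}", "Answer:"])
    return lm(prompt)
```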
- Prompting as Probing: Using Language Models for Knowledge Base Construction [1.6050172226234583]
We present ProP (Prompting as Probing), which utilizes GPT-3, a large language model released by OpenAI in 2020.
ProP implements a multi-step approach that combines a variety of prompting techniques to achieve this.
Our evaluation study indicates that these proposed techniques can substantially enhance the quality of the final predictions.
arXiv Detail & Related papers (2022-08-23T16:03:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.