From Instructions to Constraints: Language Model Alignment with
Automatic Constraint Verification
- URL: http://arxiv.org/abs/2403.06326v1
- Date: Sun, 10 Mar 2024 22:14:54 GMT
- Title: From Instructions to Constraints: Language Model Alignment with
Automatic Constraint Verification
- Authors: Fei Wang, Chao Shang, Sarthak Jain, Shuai Wang, Qiang Ning, Bonan Min,
Vittorio Castelli, Yassine Benajiba, Dan Roth
- Abstract summary: We investigate common constraints in NLP tasks and categorize them into three classes based on the types of their arguments.
We propose a unified framework, ACT (Aligning to ConsTraints), to automatically produce supervision signals for user alignment with constraints.
- Score: 70.08146540745877
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: User alignment is crucial for adapting general-purpose language models (LMs)
to downstream tasks, but human annotations are often not available for all
types of instructions, especially those with customized constraints. We observe
that user instructions typically contain constraints. While assessing response
quality in terms of the whole instruction is often costly, efficiently
evaluating the satisfaction rate of constraints is feasible. We investigate
common constraints in NLP tasks, categorize them into three classes based on
the types of their arguments, and propose a unified framework, ACT (Aligning to
ConsTraints), to automatically produce supervision signals for user alignment
with constraints. Specifically, ACT uses constraint verifiers, which are
typically easy to implement in practice, to compute the constraint satisfaction
rate (CSR) of each response. It samples multiple responses for each prompt and
automatically collects preference labels based on their CSR. Subsequently, ACT
adapts the LM to the target task through a ranking-based learning process.
Experiments on fine-grained entity typing, abstractive summarization, and
temporal question answering show that ACT is able to enhance LMs' capability to
adhere to different classes of constraints, thereby improving task performance.
Further experiments show that the constraint-following capabilities are
transferable.
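
To make the verify-then-rank step concrete, below is a minimal Python sketch, assuming a single hypothetical keyword-inclusion constraint. The verifier, the function names, and the toy responses are illustrative assumptions, not the paper's actual implementation; ACT's real verifiers are task-specific.

```python
from itertools import combinations

# Hypothetical constraint verifier: a "mention all required keywords"
# constraint. It only shows the shape of the CSR computation.
def keyword_csr(response: str, required: list[str]) -> float:
    """Constraint satisfaction rate: fraction of required keywords
    that appear in the response."""
    if not required:
        return 1.0
    hits = sum(word.lower() in response.lower() for word in required)
    return hits / len(required)

def preference_pairs(responses: list[str], required: list[str]) -> list[tuple[str, str]]:
    """Score each sampled response by CSR and emit (preferred, dispreferred)
    pairs wherever one response strictly beats another; a ranking-based
    learning step would then consume these pairs."""
    scored = [(r, keyword_csr(r, required)) for r in responses]
    pairs = []
    for (resp_a, csr_a), (resp_b, csr_b) in combinations(scored, 2):
        if csr_a > csr_b:
            pairs.append((resp_a, resp_b))
        elif csr_b > csr_a:
            pairs.append((resp_b, resp_a))
    return pairs

# Toy usage: two sampled responses to one prompt, one satisfying
# more of the constraint than the other.
samples = [
    "The report covers the budget and the hiring plan.",
    "The report covers the budget.",
]
print(preference_pairs(samples, ["budget", "hiring"]))
# -> [('The report covers the budget and the hiring plan.',
#      'The report covers the budget.')]
```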
Related papers
- Multi-Attribute Constraint Satisfaction via Language Model Rewriting [67.5778646504987]
Multi-Attribute Constraint Satisfaction (MACS) is a method for fine-tuning language models to satisfy user-specified constraints on multiple external real-valued attributes.
Our work opens new avenues for generalized and real-value multi-attribute control, with implications for diverse applications spanning NLP and bioinformatics.
arXiv Detail & Related papers (2024-12-26T12:36:39Z)
- Divide-Verify-Refine: Aligning LLM Responses with Complex Instructions [33.18076221854853]
Recent studies show that LLMs, particularly open-source models, struggle to follow complex instructions with multiple constraints.
We propose the Divide-Verify-Refine (DVR) framework, which proceeds in three steps: dividing the instruction into individual constraints, verifying responses against them, and refining responses that fail verification (a minimal sketch of such a loop appears after this list).
We show that the framework significantly improves performance, doubling Llama3.1-8B's constraint adherence on instructions with 6 constraints.
arXiv Detail & Related papers (2024-10-16T04:01:55Z)
- Benchmarking Large Language Models on Controllable Generation under Diversified Instructions [34.89012022437519]
Large language models (LLMs) have exhibited impressive instruction-following capabilities.
It is still unclear whether and to what extent they can respond to explicit constraints that might be entailed in various instructions.
We propose a new benchmark CoDI-Eval to evaluate LLMs' responses to instructions with various constraints.
arXiv Detail & Related papers (2024-01-01T07:35:31Z)
- FollowBench: A Multi-level Fine-grained Constraints Following Benchmark for Large Language Models [79.62191017182518]
FollowBench is a multi-level, fine-grained constraint-following benchmark for large language models.
We introduce a Multi-level mechanism that incrementally adds a single constraint to the initial instruction at each increased level (see the small illustration after this list).
By evaluating 13 popular LLMs on FollowBench, we highlight the weaknesses of LLMs in instruction following and point towards potential avenues for future work.
arXiv Detail & Related papers (2023-10-31T12:32:38Z)
- Eliciting Human Preferences with Language Models [56.68637202313052]
Language models (LMs) can be directed to perform target tasks by using labeled examples or natural language prompts.
We propose to use *LMs themselves* to guide the task specification process, an approach called GATE (Generative Active Task Elicitation).
We study GATE in three domains: email validation, content recommendation, and moral reasoning.
arXiv Detail & Related papers (2023-10-17T21:11:21Z)
- Self-regulating Prompts: Foundational Model Adaptation without Forgetting [112.66832145320434]
We introduce a self-regularization framework for prompting called PromptSRC.
PromptSRC guides the prompts to optimize for both task-specific and task-agnostic general representations.
arXiv Detail & Related papers (2023-07-13T17:59:35Z)
- Generative Prompt Tuning for Relation Classification [21.027631157115135]
We propose a novel generative prompt tuning method to reformulate relation classification as an infilling problem.
In addition, we design entity-guided decoding and discriminative relation scoring to generate and align relations effectively and efficiently during inference.
arXiv Detail & Related papers (2022-10-22T12:40:23Z)
- Controllable Summarization with Constrained Markov Decision Process [50.04321779376415]
We study controllable text summarization, which allows users to gain control over a particular attribute.
We propose a novel training framework based on a Constrained Markov Decision Process (CMDP).
Our framework can be applied to control important attributes of summarization, including length, covered entities, and abstractiveness.
arXiv Detail & Related papers (2021-08-07T09:12:53Z)
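
As referenced in the Divide-Verify-Refine entry above, here is a minimal sketch of a divide-verify-refine style loop. The single word-limit constraint, the callable interfaces, and all names are assumptions for illustration, not DVR's actual design, which divides instructions into many constraints and verifies each with tools.

```python
from typing import Callable

# Hypothetical verifier for one divided constraint: a word limit.
def satisfies_word_limit(response: str, limit: int) -> bool:
    return len(response.split()) <= limit

def divide_verify_refine(
    generate: Callable[[str], str],          # LLM call: instruction -> response
    refine: Callable[[str, str, str], str],  # LLM call: (instruction, response, feedback) -> response
    instruction: str,
    word_limit: int,
    max_rounds: int = 3,
) -> str:
    """Generate a response, verify the (here, single) constraint, and
    feed targeted feedback back to the model until it passes or the
    round budget runs out."""
    response = generate(instruction)
    for _ in range(max_rounds):
        if satisfies_word_limit(response, word_limit):
            break
        feedback = f"Rewrite using at most {word_limit} words."
        response = refine(instruction, response, feedback)
    return response

# Toy usage with stub "models" so the sketch runs end to end.
def stub_generate(instruction: str) -> str:
    return "A rather long draft answer with many extra words."

def stub_refine(instruction: str, response: str, feedback: str) -> str:
    return "A short answer."

print(divide_verify_refine(stub_generate, stub_refine, "Answer briefly.", word_limit=5))
# -> "A short answer."
```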
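Similarly, the FollowBench entry's Multi-level mechanism can be illustrated in a few lines: level k extends the initial instruction with k constraints, one more per level. The instruction and constraint strings here are invented for the example.

```python
# Build level-1..level-n instructions by cumulatively appending one
# constraint per level, mirroring the Multi-level mechanism.
def multilevel_instructions(initial: str, constraints: list[str]) -> list[str]:
    return [
        initial + " " + " ".join(constraints[:k])
        for k in range(1, len(constraints) + 1)
    ]

levels = multilevel_instructions(
    "Summarize the article.",
    ["Use at most 50 words.", "Mention the author by name.", "Write in French."],
)
for k, instruction in enumerate(levels, start=1):
    print(f"Level {k}: {instruction}")
```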