Privacy-Preserving Instructions for Aligning Large Language Models
- URL: http://arxiv.org/abs/2402.13659v2
- Date: Tue, 2 Jul 2024 11:57:07 GMT
- Title: Privacy-Preserving Instructions for Aligning Large Language Models
- Authors: Da Yu, Peter Kairouz, Sewoong Oh, Zheng Xu
- Abstract summary: We propose using synthetic instructions to replace real instructions in data annotation and model fine-tuning.
Formal differential privacy is guaranteed by generating those synthetic instructions using privately fine-tuned generators.
In supervised fine-tuning, models trained with private synthetic instructions outperform leading open-source models such as Vicuna.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Service providers of large language model (LLM) applications collect user instructions in the wild and use them in further aligning LLMs with users' intentions. These instructions, which potentially contain sensitive information, are annotated by human workers in the process. This poses a new privacy risk not addressed by the typical private optimization. To this end, we propose using synthetic instructions to replace real instructions in data annotation and model fine-tuning. Formal differential privacy is guaranteed by generating those synthetic instructions using privately fine-tuned generators. Crucial in achieving the desired utility is our novel filtering algorithm that matches the distribution of the synthetic instructions to that of the real ones. In both supervised fine-tuning and reinforcement learning from human feedback, our extensive experiments demonstrate the high utility of the final set of synthetic instructions by showing comparable results to real instructions. In supervised fine-tuning, models trained with private synthetic instructions outperform leading open-source models such as Vicuna.
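The distribution-matching filter described above can be illustrated with a toy sketch. The paper's actual algorithm differs; the hypothetical `match_distribution` below resamples synthetic examples over quantile bins of a scalar feature so that their histogram matches the real data's (the function name, the binning scheme, and the scalar feature are all assumptions made for illustration):

```python
import numpy as np

def match_distribution(real_feats, syn_feats, n_bins=4, n_keep=100, seed=0):
    """Resample synthetic points so their histogram over feature bins
    matches the real data's histogram (a toy stand-in for the paper's
    filtering step; the actual algorithm differs)."""
    rng = np.random.default_rng(seed)
    # Bin edges learned from the real distribution (quantiles).
    edges = np.quantile(real_feats, np.linspace(0.0, 1.0, n_bins + 1))
    real_bins = np.clip(np.digitize(real_feats, edges[1:-1]), 0, n_bins - 1)
    syn_bins = np.clip(np.digitize(syn_feats, edges[1:-1]), 0, n_bins - 1)
    # Target fraction per bin, taken from the real histogram.
    target = np.bincount(real_bins, minlength=n_bins) / len(real_feats)
    keep = []
    for b in range(n_bins):
        pool = np.flatnonzero(syn_bins == b)
        want = int(round(target[b] * n_keep))
        if len(pool) and want:
            keep.extend(rng.choice(pool, size=min(want, len(pool)), replace=False))
    return np.sort(np.array(keep, dtype=int))
```

In the paper the filter operates on learned instruction representations rather than a scalar feature, and runs after the differentially private generator has produced the synthetic pool, so the filter itself consumes no additional privacy budget.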
Related papers
- CodecLM: Aligning Language Models with Tailored Synthetic Data [51.59223474427153]
We introduce CodecLM, a framework for adaptively generating high-quality synthetic data for instruction-following abilities.
We first encode seed instructions into metadata, which are concise keywords generated on-the-fly to capture the target instruction distribution.
We also introduce Self-Rubrics and Contrastive Filtering during decoding to tailor data-efficient samples.
arXiv Detail & Related papers (2024-04-08T21:15:36Z)
- Can LLMs Generate Human-Like Wayfinding Instructions? Towards Platform-Agnostic Embodied Instruction Synthesis [51.04181562775778]
We present a novel approach to automatically synthesize "wayfinding instructions" for an embodied robot agent.
Our algorithm uses in-context learning to condition an LLM to generate instructions using just a few references.
We implement our approach on multiple simulation platforms including Matterport3D, AI Habitat and ThreeDWorld.
arXiv Detail & Related papers (2024-03-18T05:38:07Z)
- Instructive Decoding: Instruction-Tuned Large Language Models are Self-Refiner from Noisy Instructions [26.192531184689763]
This paper presents Instructive Decoding (ID), a simple yet effective approach that augments the efficacy of instruction-tuned models.
ID adjusts the logits for next-token prediction in a contrastive manner, utilizing predictions generated from a manipulated version of the original instruction.
We conduct experiments across a spectrum of such noisy instructions, ranging from those that insert semantic noise via random words to others like 'opposite' that elicit deviated responses.
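The contrastive adjustment can be sketched in a few lines. This is a minimal illustration, not the authors' code: the logits produced under the original instruction are contrasted with those produced under a perturbed ("noisy") instruction, with a scaling coefficient (named `eps` here by assumption):

```python
import numpy as np

def instructive_decoding_step(logits_orig, logits_noisy, eps=0.3):
    """One greedy next-token step in the spirit of Instructive Decoding:
    subtract a scaled copy of the logits obtained under a perturbed
    instruction from the logits under the original instruction, so that
    tokens favored only by the noisy instruction are suppressed."""
    adjusted = logits_orig - eps * logits_noisy
    return int(np.argmax(adjusted))
```

In practice both logit vectors come from the same instruction-tuned model run on two prompt variants at each decoding step; here they are plain arrays for clarity.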
arXiv Detail & Related papers (2023-11-01T02:31:35Z)
- Instruct and Extract: Instruction Tuning for On-Demand Information Extraction [86.29491354355356]
On-Demand Information Extraction aims to fulfill the personalized demands of real-world users.
We present a benchmark named InstructIE, inclusive of both automatically generated training data, as well as the human-annotated test set.
Building on InstructIE, we further develop an On-Demand Information Extractor, ODIE.
arXiv Detail & Related papers (2023-10-24T17:54:25Z)
- Evaluating the Zero-shot Robustness of Instruction-tuned Language Models [23.488398944358643]
We find that using novel (unobserved) but appropriate instruction phrasings consistently degrades model performance.
We propose a simple method to mitigate this issue by introducing "soft prompt" embedding parameters.
We show that this method consistently improves the robustness of instruction-tuned models.
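The mechanical part of the soft-prompt idea, prepending a small block of trainable vectors to the token embeddings, can be sketched as follows. This is a toy illustration under assumed shapes, not the authors' implementation; in training, only the soft-prompt matrix would receive gradients:

```python
import numpy as np

def prepend_soft_prompt(token_embeds, soft_prompt):
    """Prepend trainable soft-prompt vectors to a sequence of token
    embeddings. token_embeds: (seq_len, dim); soft_prompt: (k, dim).
    Returns a (k + seq_len, dim) sequence fed to the model in place of
    the original embeddings."""
    assert token_embeds.shape[1] == soft_prompt.shape[1], "embedding dims must match"
    return np.concatenate([soft_prompt, token_embeds], axis=0)
```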
arXiv Detail & Related papers (2023-06-20T03:48:51Z)
- Recommendation as Instruction Following: A Large Language Model Empowered Recommendation Approach [83.62750225073341]
We consider recommendation as instruction following by large language models (LLMs).
We first design a general instruction format for describing the preference, intention, task form and context of a user in natural language.
Then we manually design 39 instruction templates and automatically generate a large amount of user-personalized instruction data.
arXiv Detail & Related papers (2023-05-11T17:39:07Z)
- Self-Instruct: Aligning Language Models with Self-Generated Instructions [76.42871502364697]
Self-Instruct is a framework for improving the instruction-following capabilities of pretrained language models.
Our pipeline generates instructions, input, and output samples from a language model, then filters invalid or similar ones before using them to finetune the original model.
For further evaluation, we curate a set of expert-written instructions for novel tasks, and show through human evaluation that tuning GPT3 with Self-Instruct outperforms using existing public instruction datasets by a large margin.
arXiv Detail & Related papers (2022-12-20T18:59:19Z)
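The similarity-filtering step of the Self-Instruct pipeline can be sketched with a plain ROUGE-L check. The paper reports discarding a generated instruction whose ROUGE-L similarity to any pooled instruction exceeds 0.7; the whitespace tokenizer and function names below are simplifications for illustration:

```python
def lcs_len(a, b):
    """Length of the longest common subsequence of two token lists
    (the core of ROUGE-L), via standard dynamic programming."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[-1][-1]

def keep_if_novel(candidate, pool, threshold=0.7):
    """Self-Instruct-style de-duplication: reject a generated
    instruction whose ROUGE-L F1 against any instruction already in
    the pool exceeds `threshold`."""
    c = candidate.lower().split()
    for inst in pool:
        p = inst.lower().split()
        l = lcs_len(c, p)
        if l == 0:
            continue
        prec, rec = l / len(c), l / len(p)
        if 2 * prec * rec / (prec + rec) > threshold:
            return False
    return True
```

Instructions that survive this check are added back to the pool, so the filter also keeps later generations diverse, not just distinct from the seed set.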
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.