Related papers: Stream Aligner: Efficient Sentence-Level Alignment via Distribution Induction

Stream Aligner: Efficient Sentence-Level Alignment via Distribution Induction

URL: http://arxiv.org/abs/2501.05336v1
Date: Thu, 09 Jan 2025 16:02:51 GMT
Title: Stream Aligner: Efficient Sentence-Level Alignment via Distribution Induction
Authors: Hantao Lou, Jiaming Ji, Kaile Wang, Yaodong Yang,
Abstract summary: Stream Aligner combines efficiency with enhanced performance in various tasks throughout the generation process.<n>Compared to Aligner, our experiments demonstrate that Stream Aligner reduces reliance on the capabilities of additional models, enhances the reasoning abilities of LLMs, and decreases latency during user interaction.
Score: 6.624814871290537
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The rapid advancement of large language models (LLMs) has led to significant improvements in their capabilities, but also to increased concerns about their alignment with human values and intentions. Current alignment strategies, including adaptive training and inference-time methods, have demonstrated potential in this area. However, these approaches still struggle to balance deployment complexity and capability across various tasks and difficulties. In this work, we introduce the Streaming Distribution Induce Aligner (Stream Aligner), a novel alignment paradigm that combines efficiency with enhanced performance in various tasks throughout the generation process. Stream Aligner achieves dynamic sentence-level correction by using a small model to learn the preferences of the suffix sentence, iteratively correcting the suffix sentence output by the upstream model, and then using the corrected sentence to replace the suffix sentence in subsequent generations. Compared to Aligner, our experiments demonstrate that Stream Aligner reduces reliance on the capabilities of additional models, enhances the reasoning abilities of LLMs, and decreases latency during user interaction. Specifically, Stream Aligner-2B model has achieved an improvement of 76.1% in helpfulness, 36.0% in harmlessness on the tested Llama2-70B-chat model, and Stream Aligner-8B has achieved an improvement of 3.5% on the math ability of the tested Llama3-70B-Instruct model.

Related papers

ARIES: Stimulating Self-Refinement of Large Language Models by Iterative Preference Optimization [34.77238246296517]
A truly intelligent Large Language Model (LLM) should be capable of correcting errors in its responses through external interactions. We introduce a novel post-training and inference framework, called ARIES: Adaptive Refinement and Iterative Enhancement Structure. ARIES iteratively performs preference training and self-refinement-based data collection.
arXiv Detail & Related papers (2025-02-08T15:21:55Z)
How to Learn a New Language? An Efficient Solution for Self-Supervised Learning Models Unseen Languages Adaption in Low-Resource Scenario [72.02391485962127]
Speech Self-Supervised Learning (SSL) models achieve impressive performance on Automatic Speech Recognition (ASR) In low-resource language ASR, they encounter the domain mismatch problem between pre-trained and low-resource languages. We extend a conventional efficient fine-tuning scheme based on the adapter to handle these issues.
arXiv Detail & Related papers (2024-11-27T10:51:00Z)
Enhancing In-Context Learning via Implicit Demonstration Augmentation [26.78252788538567]
In-context learning (ICL) enables pre-trained language models to make predictions for unseen inputs without updating parameters. Despite its potential, ICL's effectiveness heavily relies on the quality, quantity, and permutation of demonstrations. In this paper, we tackle this challenge for the first time from the perspective of demonstration augmentation.
arXiv Detail & Related papers (2024-06-27T05:25:46Z)
Efficient Continual Pre-training by Mitigating the Stability Gap [68.49269649759005]
We study the behavior of Large Language Models (LLMs) during continual pre-training. We propose three effective strategies to enhance LLM performance within a fixed compute budget. Our strategies improve the average medical task performance of the OpenLlama-3B model from 36.2% to 40.7% with only 40% of the original training budget.
arXiv Detail & Related papers (2024-06-21T02:28:37Z)
Language Rectified Flow: Advancing Diffusion Language Generation with Probabilistic Flows [53.31856123113228]
This paper proposes Language Rectified Flow (ours) Our method is based on the reformulation of the standard probabilistic flow models. Experiments and ablation studies demonstrate that our method can be general, effective, and beneficial for many NLP tasks.
arXiv Detail & Related papers (2024-03-25T17:58:22Z)
Aligner: Efficient Alignment by Learning to Correct [10.056049435141645]
We introduce Aligner, a model-agnostic, plug-and-play module that learns the correctional residuals between preferred and dispreferred answers. It can be applied to various open-source and API-based models with only one-off training, making it suitable for rapid iteration. Our experiments demonstrate performance improvements by deploying the same Aligner model across 11 different language models.
arXiv Detail & Related papers (2024-02-04T09:24:51Z)
Accelerating LLaMA Inference by Enabling Intermediate Layer Decoding via Instruction Tuning with LITE [62.13435256279566]
Large Language Models (LLMs) have achieved remarkable performance across a wide variety of natural language tasks. However, their large size makes their inference slow and computationally expensive. We show that it enables these layers to acquire 'good' generation ability without affecting the generation ability of the final layer.
arXiv Detail & Related papers (2023-10-28T04:07:58Z)
A Self-Paced Mixed Distillation Method for Non-Autoregressive Generation [135.84684279852098]
Non-Autoregressive (NAR) models significantly under-perform Auto-regressive (AR) models on various language generation tasks. Among the NAR models, BANG is the first large-scale pre-training model on English un-labeled raw text corpus. We propose a novel self-paced mixed distillation method to further improve the generation quality of BANG.
arXiv Detail & Related papers (2022-05-23T09:54:53Z)
RNN-T Models Fail to Generalize to Out-of-Domain Audio: Causes and Solutions [73.45995446500312]
We analyze the generalization properties of streaming and non-streaming recurrent neural network transducer (RNN-T) based end-to-end models. We propose two solutions: combining multiple regularization techniques during training, and using dynamic overlapping inference.
arXiv Detail & Related papers (2020-05-07T06:24:47Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.