Related papers: DRESS: Instructing Large Vision-Language Models to Align and Interact with Humans via Natural Language Feedback

URL: http://arxiv.org/abs/2311.10081v2
Date: Tue, 19 Mar 2024 17:51:45 GMT
Title: DRESS: Instructing Large Vision-Language Models to Align and Interact with Humans via Natural Language Feedback
Authors: Yangyi Chen, Karan Sikka, Michael Cogswell, Heng Ji, Ajay Divakaran
Abstract summary: We present DRESS, a large vision language model (LVLM) that innovatively exploits Natural Language feedback (NLF) from Large Language Models. We propose a novel categorization of the NLF into two key types: critique and refinement. Our experimental results demonstrate that DRESS can generate more helpful (9.76%), honest (11.52%), and harmless (21.03%) responses.
Score: 61.28463542324576
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: We present DRESS, a large vision language model (LVLM) that innovatively exploits Natural Language feedback (NLF) from Large Language Models to enhance its alignment and interactions by addressing two key limitations in the state-of-the-art LVLMs. First, prior LVLMs generally rely only on the instruction finetuning stage to enhance alignment with human preferences. Without incorporating extra feedback, they are still prone to generating unhelpful, hallucinated, or harmful responses. Second, while the visual instruction tuning data is generally structured in a multi-turn dialogue format, the connections and dependencies among consecutive conversational turns are weak. This reduces the capacity for effective multi-turn interactions. To tackle these, we propose a novel categorization of the NLF into two key types: critique and refinement. The critique NLF identifies the strengths and weaknesses of the responses and is used to align the LVLMs with human preferences. The refinement NLF offers concrete suggestions for improvement and is adopted to improve the interaction ability of the LVLMs, which focuses on LVLMs' ability to refine responses by incorporating feedback in multi-turn interactions. To address the non-differentiable nature of NLF, we generalize conditional reinforcement learning for training. Our experimental results demonstrate that DRESS can generate more helpful (9.76%), honest (11.52%), and harmless (21.03%) responses, and more effectively learn from feedback during multi-turn interactions compared to SOTA LVLMs.

Related papers

Zero-Shot LLMs in Human-in-the-Loop RL: Replacing Human Feedback for Reward Shaping [0.0]
Reinforcement learning often faces challenges with reward misalignment. Human-in-the-loop (HIL) methods may exacerbate the problem, as humans are prone to biases that lead to inconsistent, subjective, or misaligned feedback.
arXiv Detail & Related papers (2025-03-26T03:17:12Z)
SEFL: Harnessing Large Language Model Agents to Improve Educational Feedback Systems [5.191286314473505]
Synthetic Educational Feedback Loops (SEFL) is a novel framework designed to deliver immediate, on-demand feedback at scale. Two large language models (LLMs) operate in teacher--student roles to simulate assignment completion and formative feedback. We show that SEFL-tuned models outperform their non-tuned counterparts in feedback quality, clarity, and timeliness.
arXiv Detail & Related papers (2025-02-18T15:09:29Z)
Preference VLM: Leveraging VLMs for Scalable Preference-Based Reinforcement Learning [17.59802090014789]
We introduce PrefVLM, a framework that integrates Vision-Language Models (VLMs) with selective human feedback. Our method leverages VLMs to generate initial preference labels, which are then filtered to identify uncertain cases for targeted human annotation. Experiments on Meta-World manipulation tasks demonstrate that PrefVLM achieves comparable or superior success rates to state-of-the-art methods.
arXiv Detail & Related papers (2025-02-03T18:50:15Z)
Looking Beyond Text: Reducing Language bias in Large Vision-Language Models via Multimodal Dual-Attention and Soft-Image Guidance [67.26434607115392]
Large vision-language models (LVLMs) have achieved impressive results in various vision-language tasks. LVLMs suffer from hallucinations caused by language bias, leading to diminished focus on images and ineffective visual comprehension. We propose LACING to address the language bias of LVLMs with muLtimodal duAl-attention meChanIsm (MDA) aNd soft-image Guidance (IFG).
arXiv Detail & Related papers (2024-11-21T16:33:30Z)
VLFeedback: A Large-Scale AI Feedback Dataset for Large Vision-Language Models Alignment [55.7956150385255]
We investigate the efficacy of AI feedback to scale supervision for aligning vision-language models. We introduce VLFeedback, the first large-scale vision-language feedback dataset. We train Silkie, an LVLM fine-tuned via direct preference optimization on VLFeedback.
arXiv Detail & Related papers (2024-10-12T07:56:47Z)
MACAROON: Training Vision-Language Models To Be Your Engaged Partners [95.32771929749514]
Large vision-language models (LVLMs) generate detailed responses even when questions are ambiguous or unlabeled. In this study, we aim to shift LVLMs from passive answer providers to proactive engaged partners. We introduce MACAROON, self-iMaginAtion for ContrAstive pReference OptimizatiON, which instructs LVLMs to autonomously generate contrastive response pairs for unlabeled questions.
arXiv Detail & Related papers (2024-06-20T09:27:33Z)
FGAIF: Aligning Large Vision-Language Models with Fine-grained AI Feedback [16.24562885483636]
We propose an innovative method to align modalities in Large Vision-Language Models (LVLMs) through Fine-Grained Artificial Intelligence Feedback (FGAIF). Specifically, we first utilize AI tools to predict the types of hallucination for each segment in the response and obtain a collection of fine-grained feedback. Then, based on the collected reward data, three specialized reward models are trained to produce dense rewards. Finally, a novel fine-grained feedback module is integrated into the Proximal Policy Optimization (PPO) algorithm.
arXiv Detail & Related papers (2024-04-07T19:00:45Z)
How do Large Language Models Navigate Conflicts between Honesty and Helpfulness? [14.706111954807021]
We use psychological models and experiments designed to characterize human behavior to analyze large language models. We find that reinforcement learning from human feedback improves both honesty and helpfulness. GPT-4 Turbo demonstrates human-like response patterns including sensitivity to the conversational framing and listener's decision context.
arXiv Detail & Related papers (2024-02-11T19:13:26Z)
Data-Efficient Alignment of Large Language Models with Human Feedback Through Natural Language [31.0723480021355]
We investigate data efficiency of modeling human feedback that is in natural language. We fine-tune an open-source LLM, e.g., Falcon-40B-Instruct, on a relatively small amount of human feedback in natural language. We show that this model is able to improve the quality of responses from even some of the strongest LLMs.
arXiv Detail & Related papers (2023-11-24T15:20:36Z)
Are Large Language Models Really Robust to Word-Level Perturbations? [68.60618778027694]
We propose a novel rational evaluation approach that leverages pre-trained reward models as diagnostic tools. Longer conversations manifest the comprehensive grasp of language models in terms of their proficiency in understanding questions. Our results demonstrate that LLMs frequently exhibit vulnerability to word-level perturbations that are commonplace in daily language usage.
arXiv Detail & Related papers (2023-09-20T09:23:46Z)
Training Language Models with Language Feedback at Scale [50.70091340506957]
We introduce Imitation learning from Language Feedback (ILF), a new approach that utilizes more informative language feedback. ILF consists of three steps that are applied iteratively: first, conditioning the language model on the input, an initial LM output, and feedback to generate refinements. We show theoretically that ILF can be viewed as Bayesian Inference, similar to Reinforcement Learning from human feedback.
arXiv Detail & Related papers (2023-03-28T17:04:15Z)

This list is automatically generated from the titles and abstracts of the papers on this site.