Can We Edit Multimodal Large Language Models?
- URL: http://arxiv.org/abs/2310.08475v5
- Date: Thu, 18 Apr 2024 15:46:22 GMT
- Title: Can We Edit Multimodal Large Language Models?
- Authors: Siyuan Cheng, Bozhong Tian, Qingbin Liu, Xi Chen, Yongheng Wang, Huajun Chen, Ningyu Zhang,
- Abstract summary: We construct a new benchmark, dubbed MMEdit, for editing multimodal LLMs.
We conduct comprehensive experiments involving various model editing baselines and analyze the impact of editing different components.
- Score: 38.292574747632635
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we focus on editing Multimodal Large Language Models (MLLMs). Compared to editing single-modal LLMs, multimodal model editing is more challenging, which demands a higher level of scrutiny and careful consideration in the editing process. To facilitate research in this area, we construct a new benchmark, dubbed MMEdit, for editing multimodal LLMs and establishing a suite of innovative metrics for evaluation. We conduct comprehensive experiments involving various model editing baselines and analyze the impact of editing different components for multimodal LLMs. Empirically, we notice that previous baselines can implement editing multimodal LLMs to some extent, but the effect is still barely satisfactory, indicating the potential difficulty of this task. We hope that our work can provide the NLP community with insights. Code and dataset are available in https://github.com/zjunlp/EasyEdit.
Related papers
- UniEdit: A Unified Knowledge Editing Benchmark for Large Language Models [16.546605509744015]
We introduce UniEdit, a unified benchmark for large language models (LLMs) editing grounded in open-domain knowledge.<n>First, we construct editing samples by selecting entities from 25 common domains across five major categories.<n>To address the issues of generality and locality in editing, we design an Neighborhood Multi-hop Chain Sampling (NMCS) algorithm.
arXiv Detail & Related papers (2025-05-18T10:19:01Z) - MindBridge: Scalable and Cross-Model Knowledge Editing via Memory-Augmented Modality [55.01380617388064]
Most existing methods overfit to specific models, causing edited knowledge to be discarded during each update.
We introduce MindBridge, a scalable solution inspired by the low coupling between modality processing and LLMs in multi-modal models.
MindBridge achieves superior performance even in editing tens of thousands of knowledge entries and can flexibly adapt to different LLMs.
arXiv Detail & Related papers (2025-03-04T15:17:57Z) - ComprehendEdit: A Comprehensive Dataset and Evaluation Framework for Multimodal Knowledge Editing [27.034072044001736]
Large multimodal language models (MLLMs) have revolutionized natural language processing and visual understanding.
Current knowledge editing evaluations are limited in scope and potentially biased.
We introduce ComprehendEdit, a comprehensive benchmark comprising eight diverse tasks from multiple datasets.
arXiv Detail & Related papers (2024-12-17T11:41:49Z) - Lifelong Knowledge Editing for Vision Language Models with Low-Rank Mixture-of-Experts [17.376346967267327]
We propose LiveEdit, a LIfelong Vision language modEl Edit to bridge the gap between lifelong LLM editing and Vision LLM editing.
A hard filtering mechanism is developed to utilize visual semantic knowledge, thereby eliminating visually irrelevant experts for input queries.
To integrate visually relevant experts, we introduce a soft routing mechanism based on textual semantic relevance to achieve multi-expert fusion.
arXiv Detail & Related papers (2024-11-23T03:19:40Z) - Instruction-Guided Editing Controls for Images and Multimedia: A Survey in LLM era [50.19334853510935]
Recent strides in instruction-based editing have enabled intuitive interaction with visual content, using natural language as a bridge between user intent and complex editing operations.
We aim to democratize powerful visual editing across various industries, from entertainment to education.
arXiv Detail & Related papers (2024-11-15T05:18:15Z) - Editing Conceptual Knowledge for Large Language Models [65.38231526537476]
This paper pioneers the investigation of editing conceptual knowledge for Large Language Models (LLMs)
We construct a novel benchmark dataset ConceptEdit and establish a suite of new metrics for evaluation.
experimental results reveal that, although existing editing methods can efficiently modify concept-level definition to some extent, they also have the potential to distort the related instantial knowledge.
arXiv Detail & Related papers (2024-03-10T16:57:10Z) - The Butterfly Effect of Model Editing: Few Edits Can Trigger Large Language Models Collapse [58.0132400208411]
Even a single edit can trigger model collapse, manifesting as significant performance degradation in various benchmark tasks.
benchmarking Large Language Models after each edit is impractically time-consuming and resource-intensive.
We have utilized GPT-3.5 to develop a new dataset, HardEdit, based on hard cases.
arXiv Detail & Related papers (2024-02-15T01:50:38Z) - MELO: Enhancing Model Editing with Neuron-Indexed Dynamic LoRA [34.21194537887934]
We propose a plug-in Model Editing method based on neuron-indexed dynamic LoRA (MELO)
Our proposed MELO achieves state-of-the-art editing performance on three sequential editing tasks.
arXiv Detail & Related papers (2023-12-19T02:11:01Z) - Macaw-LLM: Multi-Modal Language Modeling with Image, Audio, Video, and
Text Integration [50.94902442781148]
We propose a novel multi-modal large language model (LLM) that seamlessly integrates visual, audio, and textual information.
Macaw-LLM consists of three main components: a modality module for encoding multi-modal data, a cognitive module for harnessing pretrained LLMs, and an alignment module for harmonizing diverse representations.
We construct a large-scale multi-modal instruction dataset in terms of multi-turn dialogue, including 69K image instances and 50K video instances.
arXiv Detail & Related papers (2023-06-15T12:45:25Z) - Editing Large Language Models: Problems, Methods, and Opportunities [51.903537096207]
This paper embarks on a deep exploration of the problems, methods, and opportunities related to model editing for LLMs.
We provide an exhaustive overview of the task definition and challenges associated with model editing, along with an in-depth empirical analysis of the most progressive methods currently at our disposal.
Our objective is to provide valuable insights into the effectiveness and feasibility of each editing technique, thereby assisting the community in making informed decisions on the selection of the most appropriate method for a specific task or context.
arXiv Detail & Related papers (2023-05-22T16:00:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.