DroidSpeak: Enhancing Cross-LLM Communication
- URL: http://arxiv.org/abs/2411.02820v1
- Date: Tue, 05 Nov 2024 05:41:41 GMT
- Title: DroidSpeak: Enhancing Cross-LLM Communication
- Authors: Yuhan Liu, Esha Choukse, Shan Lu, Junchen Jiang, Madan Musuvathi
- Abstract summary: We introduce DroidSpeak, a novel framework targeting cross-LLM communication.
We efficiently bypass the need to reprocess entire contexts for fine-tuned versions of the same foundational model.
Our findings underscore the potential to create more efficient and scalable multi-agent systems.
- Score: 15.901409892276288
- Abstract: In multi-agent systems utilizing Large Language Models (LLMs), communication between agents traditionally relies on natural language. This communication often includes the full context of the query so far, which can introduce significant prefill-phase latency, especially with long contexts. We introduce DroidSpeak, a novel framework to target this cross-LLM communication by leveraging the reuse of intermediate data, such as input embeddings (E-cache) and key-value caches (KV-cache). We efficiently bypass the need to reprocess entire contexts for fine-tuned versions of the same foundational model. This approach allows faster context integration while maintaining the quality of task performance. Experimental evaluations demonstrate DroidSpeak's ability to significantly accelerate inter-agent communication, achieving up to a 2.78x speedup in prefill latency with negligible loss in accuracy. Our findings underscore the potential to create more efficient and scalable multi-agent systems.
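The core mechanism is reusing intermediate prefill artifacts across fine-tuned variants of one foundational model. Below is a minimal sketch of the KV-cache handoff, assuming two fine-tunes that share the base architecture and tokenizer; the model names are placeholders and this is not the paper's implementation.

```python
# Minimal sketch of cross-LLM KV-cache reuse in the spirit of DroidSpeak.
# Assumes two fine-tuned variants of the same foundational model that share
# the tokenizer and layer shapes; model names below are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("org/base-model")               # placeholder
sender = AutoModelForCausalLM.from_pretrained("org/sender-ft")      # fine-tune A
receiver = AutoModelForCausalLM.from_pretrained("org/receiver-ft")  # fine-tune B

context = "...long shared conversation context..."
ctx = tok(context, return_tensors="pt")

# Sender prefills the context once and exports its KV-cache.
with torch.no_grad():
    out = sender(**ctx, use_cache=True)
kv_cache = out.past_key_values  # per-layer (key, value) tensors

# Receiver skips re-prefilling: it consumes the sender's cache and only
# processes the new query tokens, cutting prefill latency.
query = tok(" Please summarize the discussion.", return_tensors="pt")
attn = torch.ones(1, ctx.input_ids.shape[1] + query.input_ids.shape[1],
                  dtype=torch.long)  # mask must cover cached + new tokens
with torch.no_grad():
    out = receiver(input_ids=query.input_ids, attention_mask=attn,
                   past_key_values=kv_cache, use_cache=True)
next_token = out.logits[:, -1].argmax(dim=-1)
```

Shipping input embeddings (the E-cache) instead of per-layer key/value tensors would follow the same exchange pattern.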
Related papers
- Remote Timing Attacks on Efficient Language Model Inference [63.79839291641793]
We show that timing differences in language model inference can be exploited to mount a remote timing attack.
We show how it is possible to learn the topic of a user's conversation with 90%+ precision.
For open-source systems, an adversary can leverage a boosting attack to recover PII placed in messages.
arXiv Detail & Related papers (2024-10-22T16:51:36Z)
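To illustrate the measurement primitive behind such attacks, here is a hypothetical timing probe; the endpoint, payload, and sample count are assumptions, not the paper's setup.

```python
# Hypothetical sketch of a remote timing probe: repeated requests and a
# median latency estimate per prompt class. Endpoint and payload are
# illustrative assumptions, not the attack from the paper.
import time
import statistics
import requests

def median_latency(prompt: str, url: str, n: int = 20) -> float:
    samples = []
    for _ in range(n):
        t0 = time.perf_counter()
        requests.post(url, json={"prompt": prompt}, timeout=30)
        samples.append(time.perf_counter() - t0)
    return statistics.median(samples)

# Differences in median latency across candidate prompts can leak, e.g.,
# how many tokens the server had to process or generate.
```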
- Hello Again! LLM-powered Personalized Agent for Long-term Dialogue [63.65128176360345]
We introduce a model-agnostic framework, the Long-term Dialogue Agent (LD-Agent).
It incorporates three independently tunable modules dedicated to event perception, persona extraction, and response generation.
The effectiveness, generality, and cross-domain capabilities of LD-Agent are empirically demonstrated.
arXiv Detail & Related papers (2024-06-09T21:58:32Z)
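A schematic of the three-module decomposition might look like the following; the interfaces and toy rules are illustrative assumptions, not the paper's actual modules.

```python
# Schematic sketch of LD-Agent's three independently tunable modules:
# event perception, persona extraction, and response generation.
# Interfaces and the toy rules below are illustrative assumptions.
from dataclasses import dataclass, field

@dataclass
class LongTermDialogueAgent:
    memory: list = field(default_factory=list)   # long-term event store
    persona: dict = field(default_factory=dict)  # extracted user traits

    def perceive_event(self, turn: str) -> None:
        # Event perception: distill each turn into a memory entry.
        self.memory.append(turn)

    def extract_persona(self, turn: str) -> None:
        # Persona extraction: update user traits (toy keyword rule).
        if "i like" in turn.lower():
            self.persona["likes"] = turn.lower().split("i like", 1)[1].strip()

    def respond(self, turn: str) -> str:
        # Response generation conditioned on memory and persona.
        self.perceive_event(turn)
        self.extract_persona(turn)
        likes = self.persona.get("likes", "unknown")
        return f"(conditioning on {len(self.memory)} remembered turns; user likes: {likes})"
```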
- Agent-driven Generative Semantic Communication with Cross-Modality and Prediction [57.335922373309074]
We propose A-GSC, a novel agent-driven generative semantic communication framework based on reinforcement learning.
We develop an agent-assisted semantic encoder with cross-modality capability that tracks semantic changes and channel conditions to perform adaptive semantic extraction and sampling.
The effectiveness of the designed models has been verified using the UA-DETRAC dataset, demonstrating the performance gains of the overall A-GSC framework.
arXiv Detail & Related papers (2024-04-10T13:24:27Z)
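A toy version of the adaptive extraction-and-sampling decision could look like this; the drift metric, thresholds, and SNR feature are assumptions standing in for the paper's learned policy.

```python
# Toy sketch of adaptive semantic sampling: transmit new semantics only when
# semantic drift is large and the channel is good enough. The drift metric
# and thresholds are assumptions, not the paper's trained RL policy.
import numpy as np

def should_transmit(prev_feat: np.ndarray, cur_feat: np.ndarray,
                    channel_snr_db: float,
                    drift_thresh: float = 0.2, snr_floor_db: float = 5.0) -> bool:
    drift = np.linalg.norm(cur_feat - prev_feat) / (np.linalg.norm(prev_feat) + 1e-8)
    return drift > drift_thresh and channel_snr_db > snr_floor_db
```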
- Context-aware Communication for Multi-agent Reinforcement Learning [6.109127175562235]
We develop CACOM, a context-aware communication scheme for multi-agent reinforcement learning (MARL).
In the first stage, agents exchange coarse representations in a broadcast fashion, providing context for the second stage.
Following this, agents utilize attention mechanisms in the second stage to selectively generate messages personalized for the receivers.
To evaluate the effectiveness of CACOM, we integrate it with both actor-critic and value-based MARL algorithms.
arXiv Detail & Related papers (2023-12-25T03:33:08Z)
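The two-stage scheme can be sketched as a broadcast encoder followed by attention over all agents' coarse contexts; the module shapes below are illustrative assumptions.

```python
# Toy sketch of CACOM's two-stage communication: (1) agents broadcast coarse
# context vectors, (2) attention over those contexts yields messages
# personalized per receiver. Module shapes are illustrative assumptions.
import torch
import torch.nn as nn

class TwoStageComm(nn.Module):
    def __init__(self, obs_dim: int, msg_dim: int, n_heads: int = 4):
        super().__init__()
        # msg_dim must be divisible by n_heads.
        self.coarse = nn.Linear(obs_dim, msg_dim)    # stage 1: broadcast encoding
        self.attn = nn.MultiheadAttention(msg_dim, n_heads, batch_first=True)
        self.msg_head = nn.Linear(msg_dim, msg_dim)  # stage 2: personalized message

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # obs: (n_agents, obs_dim) local observations
        ctx = self.coarse(obs).unsqueeze(0)          # (1, n_agents, msg_dim)
        # Stage 2: each agent attends over all coarse contexts so its outgoing
        # message can be tailored to specific receivers.
        personalized, _ = self.attn(ctx, ctx, ctx)
        return self.msg_head(personalized.squeeze(0))  # (n_agents, msg_dim)
```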
- TESS: A Multi-intent Parser for Conversational Multi-Agent Systems with Decentralized Natural Language Understanding Models [6.470108226184637]
Multi-agent systems complicate the natural language understanding of user intents.
We propose an efficient parsing and orchestration pipeline algorithm to handle multi-intent user utterances.
arXiv Detail & Related papers (2023-12-19T03:39:23Z)
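A bare-bones parse-and-route loop conveys the idea; the clause splitter, keyword scorer, and agent registry are placeholder assumptions standing in for the paper's decentralized NLU models.

```python
# Illustrative sketch of multi-intent parsing and orchestration: split a
# compound utterance into clauses and route each to the best-scoring agent.
# The splitter, scorer, and agent registry are placeholder assumptions.
AGENTS = {
    "weather": lambda q: f"[weather agent handles] {q}",
    "music": lambda q: f"[music agent handles] {q}",
}

def intent_score(intent: str, clause: str) -> float:
    # Stand-in for a decentralized NLU model's confidence score.
    return 1.0 if intent in clause.lower() else 0.0

def parse_and_route(utterance: str) -> list:
    clauses = [c.strip() for c in utterance.split(" and ") if c.strip()]
    results = []
    for clause in clauses:
        best = max(AGENTS, key=lambda i: intent_score(i, clause))
        results.append(AGENTS[best](clause))
    return results

print(parse_and_route("check the weather and play some music"))
```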
- Communication-Efficient Federated Optimization over Semi-Decentralized Networks [42.11743453542266]
Communication efficiency is one of the most challenging bottlenecks in large-scale networks.
We study communication efficiency under a semi-decentralized communication protocol, in which agents can perform both agent-to-agent and agent-to-server communication.
arXiv Detail & Related papers (2023-11-30T18:37:15Z)
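The protocol's two communication modes can be mimicked with gossip steps punctuated by occasional server averaging; the mixing matrix and schedule are assumptions, not the paper's algorithm.

```python
# Numerical sketch of semi-decentralized communication: agents gossip with
# neighbors every round (agent-to-agent) and periodically average through a
# server (agent-to-server). Mixing matrix and schedule are assumptions.
import numpy as np

def semi_decentralized_avg(x: np.ndarray, W: np.ndarray,
                           server_every: int = 5, rounds: int = 20) -> np.ndarray:
    # x: (n_agents, dim) local parameters; W: doubly stochastic gossip matrix.
    for t in range(rounds):
        x = W @ x                       # agent-to-agent gossip step
        if (t + 1) % server_every == 0:
            x = np.tile(x.mean(axis=0), (x.shape[0], 1))  # server averaging
    return x
```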
- Leveraging Timestamp Information for Serialized Joint Streaming Recognition and Translation [51.399695200838586]
We propose a streaming Transformer-Transducer (T-T) model able to jointly produce many-to-one and one-to-many transcription and translation using a single decoder.
Experiments on it, es, de -> en language pairs demonstrate the effectiveness of our approach, enabling the generation of one-to-many joint outputs with a single decoder for the first time.
arXiv Detail & Related papers (2023-10-23T11:00:27Z)
- Towards Efficient Dialogue Pre-training with Transferable and Interpretable Latent Structure [77.30953347462452]
This paper proposes a novel dialogue generation model with a latent structure that is easily transferable from the general domain to downstream tasks in a lightweight and transparent way.
Thanks to the transferable latent structure, our model is able to yield better dialogue responses than four strong baselines in terms of both automatic and human evaluations.
arXiv Detail & Related papers (2022-10-22T14:46:43Z)
- Accelerating Federated Edge Learning via Optimized Probabilistic Device Scheduling [57.271494741212166]
This paper formulates and solves the communication time minimization problem.
It is found that the optimized policy gradually turns its priority from suppressing the remaining communication rounds to reducing per-round latency as the training process evolves.
The effectiveness of the proposed scheme is demonstrated via a use case on collaborative 3D object detection in autonomous driving.
arXiv Detail & Related papers (2021-07-24T11:39:17Z)
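A toy probabilistic scheduler shows the latency/significance trade-off such a policy navigates; the scoring rule below is an assumption, not the paper's derived policy.

```python
# Toy sketch of probabilistic device scheduling: sample one device per round
# with probability balancing update significance against communication
# latency. The scoring rule is an assumption, not the paper's derived policy.
import numpy as np

def schedule_device(grad_norms: np.ndarray, latencies: np.ndarray,
                    lam: float = 0.5) -> int:
    # Larger updates are favored; slower links are penalized by lam.
    scores = grad_norms / (latencies ** lam)
    probs = scores / scores.sum()
    return int(np.random.choice(len(probs), p=probs))
```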
- Minimizing Communication while Maximizing Performance in Multi-Agent Reinforcement Learning [5.612141846711729]
Inter-agent communication can significantly increase performance in multi-agent tasks that require coordination.
In real-world applications, where communication may be limited by system constraints like bandwidth, power and network capacity, one might need to reduce the number of messages that are sent.
We show that we can reduce communication by 75% with no loss of performance.
arXiv Detail & Related papers (2021-06-15T23:13:51Z)
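One simple way to realize such message reduction is a learned gate that suppresses low-value messages; the gate, threshold, and zero-fill convention are assumptions, not the paper's method.

```python
# Toy sketch of communication gating in MARL: transmit a message only when a
# learned gate exceeds a threshold; suppressed messages cost no bandwidth.
# The gate, threshold, and zero-fill convention are assumptions.
import torch
import torch.nn as nn

class GatedMessenger(nn.Module):
    def __init__(self, obs_dim: int, msg_dim: int, threshold: float = 0.5):
        super().__init__()
        self.encoder = nn.Linear(obs_dim, msg_dim)
        self.gate = nn.Sequential(nn.Linear(obs_dim, 1), nn.Sigmoid())
        self.threshold = threshold

    def forward(self, obs: torch.Tensor):
        # obs: (n_agents, obs_dim); send: (n_agents, 1) boolean decisions
        send = self.gate(obs) > self.threshold
        msg = self.encoder(obs)
        # Receivers observe zeros for suppressed messages.
        return torch.where(send, msg, torch.zeros_like(msg)), send
```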