Self-Supervised Inference of Agents in Trustless Environments
- URL: http://arxiv.org/abs/2409.08386v1
- Date: Thu, 12 Sep 2024 20:32:07 GMT
- Title: Self-Supervised Inference of Agents in Trustless Environments
- Authors: Vladyslav Larin, Ivan Nikitin, Alexander Firsov,
- Abstract summary: We propose a novel approach where agents can form swarms to produce high-quality responses effectively.
This is accomplished by utilizing agents capable of data inference and ranking.
We show that our approach is an order of magnitude faster than other trustless inference strategies reaching less than 125 ms validation latency.
- Score: 44.99833362998488
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we propose a novel approach where agents can form swarms to produce high-quality responses effectively. This is accomplished by utilizing agents capable of data inference and ranking, which can be effectively implemented using LLMs as response classifiers. We assess existing approaches for trustless agent inference, define our methodology, estimate practical parameters, and model various types of malicious agent attacks. Our method leverages the collective intelligence of swarms, ensuring robust and efficient decentralized AI inference with better accuracy, security, and reliability. We show that our approach is an order of magnitude faster than other trustless inference strategies reaching less than 125 ms validation latency.
Related papers
- From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise reward to optimize the agent's reinforcement learning process.
We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z) - Deep Multi-Agent Reinforcement Learning for Decentralized Active
Hypothesis Testing [11.639503711252663]
We tackle the multi-agent active hypothesis testing (AHT) problem by introducing a novel algorithm rooted in the framework of deep multi-agent reinforcement learning.
We present a comprehensive set of experimental results that effectively showcase the agents' ability to learn collaborative strategies and enhance performance.
arXiv Detail & Related papers (2023-09-14T01:18:04Z) - Achieving Fairness in Multi-Agent Markov Decision Processes Using
Reinforcement Learning [30.605881670761853]
We propose a Reinforcement Learning approach to achieve fairness in finite-horizon episodic MDPs.
We show that such an approach achieves sub-linear regret in terms of the number of episodes.
arXiv Detail & Related papers (2023-06-01T03:43:53Z) - Byzantine-Robust Online and Offline Distributed Reinforcement Learning [60.970950468309056]
We consider a distributed reinforcement learning setting where multiple agents explore the environment and communicate their experiences through a central server.
$alpha$-fraction of agents are adversarial and can report arbitrary fake information.
We seek to identify a near-optimal policy for the underlying Markov decision process in the presence of these adversarial agents.
arXiv Detail & Related papers (2022-06-01T00:44:53Z) - Explaining Reinforcement Learning Policies through Counterfactual
Trajectories [147.7246109100945]
A human developer must validate that an RL agent will perform well at test-time.
Our method conveys how the agent performs under distribution shifts by showing the agent's behavior across a wider trajectory distribution.
In a user study, we demonstrate that our method enables users to score better than baseline methods on one of two agent validation tasks.
arXiv Detail & Related papers (2022-01-29T00:52:37Z) - Improving Model Robustness with Latent Distribution Locally and Globally [28.99007833855102]
In this work, we consider model robustness of deep neural networks against adversarial attacks from a global manifold perspective.
We propose a novel adversarial training method through robust optimization, and a tractable way to generate Latent Manifold Adrial Examples (LMAEs)
The proposed adversarial training with latent distribution (ATLD) method defends against adversarial attacks by crafting LMAEs with the latent manifold in an unsupervised manner.
arXiv Detail & Related papers (2021-07-08T07:52:53Z) - Deep Interactive Bayesian Reinforcement Learning via Meta-Learning [63.96201773395921]
The optimal adaptive behaviour under uncertainty over the other agents' strategies can be computed using the Interactive Bayesian Reinforcement Learning framework.
We propose to meta-learn approximate belief inference and Bayes-optimal behaviour for a given prior.
We show empirically that our approach outperforms existing methods that use a model-free approach, sample from the approximate posterior, maintain memory-free models of others, or do not fully utilise the known structure of the environment.
arXiv Detail & Related papers (2021-01-11T13:25:13Z) - Gaussian Process Based Message Filtering for Robust Multi-Agent
Cooperation in the Presence of Adversarial Communication [5.161531917413708]
We consider the problem of providing robustness to adversarial communication in multi-agent systems.
We propose a communication architecture based on Graph Neural Networks (GNNs)
We show that our filtering method is able to reduce the impact that non-cooperative agents cause.
arXiv Detail & Related papers (2020-12-01T14:21:58Z) - Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches, is to update the prototype of each class with the mean of the most confident query examples.
We propose to meta-learn the confidence for each query sample, to assign optimal weights to unlabeled queries.
We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
arXiv Detail & Related papers (2020-02-27T10:22:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.