Related papers: Self-Supervised Inference of Agents in Trustless Environments

Related papers

Agent-Omit: Training Efficient LLM Agents for Adaptive Thought and Observation Omission via Agentic Reinforcement Learning [15.39565540937229]
Managing agent thought and observation during multi-turn agent-environment interactions is an emerging strategy to improve efficiency.<n>We propose Agent-Omit, a unified training framework that empowers LLM agents to adaptively omit redundant thoughts and observations.<n> Experimental results show that our constructed Agent-Omit-8B could obtain performance comparable to seven frontier LLM agents.
arXiv Detail & Related papers (2026-02-04T07:26:23Z)
Probably Approximately Global Robustness Certification [2.957223821964636]
We propose and investigate probabilistic guarantees for the adversarial robustness of classification algorithms.<n>Key idea is to sample an $epsilon$-net and invoke a local robustness oracle on the sample.<n>Our approach can be applied even to large neural networks that are beyond the scope of traditional formal verification.
arXiv Detail & Related papers (2025-11-09T18:46:22Z)
Enhancing Robustness of LLM-Driven Multi-Agent Systems through Randomized Smoothing [13.997409139696556]
This paper presents a framework for enhancing the safety of large language model (LLM) empowered multi-agent systems (MAS) in safety-critical domains such as aerospace.<n>We apply randomized smoothing, a statistical robustness certification technique, to the MAS consensus context, enabling probabilistic guarantees on agent decisions under adversarial influence.
arXiv Detail & Related papers (2025-07-05T17:26:08Z)
Query-Level Uncertainty in Large Language Models [13.195074492564332]
We introduce a novel and training-free method called emphInternal Confidence, which leverages self-evaluations across layers and tokens.<n> Empirical results on both factual QA and mathematical reasoning tasks demonstrate that our internal confidence can outperform several baselines.<n>Our proposed method can be used for efficient RAG and model cascading, which is able to reduce inference costs while maintaining performance.
arXiv Detail & Related papers (2025-06-11T12:39:48Z)
Uncertainty Quantification for LLMs through Minimum Bayes Risk: Bridging Confidence and Consistency [66.96286531087549]
Uncertainty quantification (UQ) methods for Large Language Models (LLMs) encompass a variety of approaches.<n>We propose a novel approach to integrating model confidence with output consistency, resulting in a family of efficient and robust UQ methods.<n>We evaluate our approach across various tasks such as question answering, abstractive summarization, and machine translation.
arXiv Detail & Related papers (2025-02-07T14:30:12Z)
Refining Input Guardrails: Enhancing LLM-as-a-Judge Efficiency Through Chain-of-Thought Fine-Tuning and Alignment [2.9775785740619254]
Large Language Models (LLMs) have demonstrated powerful capabilities that render them valuable in different applications, including conversational AI products. It is paramount to ensure the security and reliability of these products by mitigating their vulnerabilities towards malicious user interactions. We present a study on the efficacy of fine-tuning and aligning Chain-of-Thought (CoT) responses of different LLMs that serve as input moderation guardrails.
arXiv Detail & Related papers (2025-01-22T18:40:57Z)
From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise reward to optimize the agent's reinforcement learning process. We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z)
Deep Multi-Agent Reinforcement Learning for Decentralized Active Hypothesis Testing [11.639503711252663]
We tackle the multi-agent active hypothesis testing (AHT) problem by introducing a novel algorithm rooted in the framework of deep multi-agent reinforcement learning. We present a comprehensive set of experimental results that effectively showcase the agents' ability to learn collaborative strategies and enhance performance.
arXiv Detail & Related papers (2023-09-14T01:18:04Z)
Achieving Fairness in Multi-Agent Markov Decision Processes Using Reinforcement Learning [30.605881670761853]
We propose a Reinforcement Learning approach to achieve fairness in finite-horizon episodic MDPs. We show that such an approach achieves sub-linear regret in terms of the number of episodes.
arXiv Detail & Related papers (2023-06-01T03:43:53Z)
Byzantine-Robust Online and Offline Distributed Reinforcement Learning [60.970950468309056]
We consider a distributed reinforcement learning setting where multiple agents explore the environment and communicate their experiences through a central server. $alpha$-fraction of agents are adversarial and can report arbitrary fake information. We seek to identify a near-optimal policy for the underlying Markov decision process in the presence of these adversarial agents.
arXiv Detail & Related papers (2022-06-01T00:44:53Z)
Explaining Reinforcement Learning Policies through Counterfactual Trajectories [147.7246109100945]
A human developer must validate that an RL agent will perform well at test-time. Our method conveys how the agent performs under distribution shifts by showing the agent's behavior across a wider trajectory distribution. In a user study, we demonstrate that our method enables users to score better than baseline methods on one of two agent validation tasks.
arXiv Detail & Related papers (2022-01-29T00:52:37Z)
Improving Model Robustness with Latent Distribution Locally and Globally [28.99007833855102]
In this work, we consider model robustness of deep neural networks against adversarial attacks from a global manifold perspective. We propose a novel adversarial training method through robust optimization, and a tractable way to generate Latent Manifold Adrial Examples (LMAEs) The proposed adversarial training with latent distribution (ATLD) method defends against adversarial attacks by crafting LMAEs with the latent manifold in an unsupervised manner.
arXiv Detail & Related papers (2021-07-08T07:52:53Z)
Deep Interactive Bayesian Reinforcement Learning via Meta-Learning [63.96201773395921]
The optimal adaptive behaviour under uncertainty over the other agents' strategies can be computed using the Interactive Bayesian Reinforcement Learning framework. We propose to meta-learn approximate belief inference and Bayes-optimal behaviour for a given prior. We show empirically that our approach outperforms existing methods that use a model-free approach, sample from the approximate posterior, maintain memory-free models of others, or do not fully utilise the known structure of the environment.
arXiv Detail & Related papers (2021-01-11T13:25:13Z)
Gaussian Process Based Message Filtering for Robust Multi-Agent Cooperation in the Presence of Adversarial Communication [5.161531917413708]
We consider the problem of providing robustness to adversarial communication in multi-agent systems. We propose a communication architecture based on Graph Neural Networks (GNNs) We show that our filtering method is able to reduce the impact that non-cooperative agents cause.
arXiv Detail & Related papers (2020-12-01T14:21:58Z)
Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches, is to update the prototype of each class with the mean of the most confident query examples. We propose to meta-learn the confidence for each query sample, to assign optimal weights to unlabeled queries. We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
arXiv Detail & Related papers (2020-02-27T10:22:17Z)

This list is automatically generated from the titles and abstracts of the papers in this site.