Related papers: Can Optimal Transport Improve Federated Inverse Reinforcement Learning?

URL: http://arxiv.org/abs/2601.00309v1
Date: Thu, 01 Jan 2026 11:13:34 GMT
Title: Can Optimal Transport Improve Federated Inverse Reinforcement Learning?
Authors: David Millard, Ali Baheri
Abstract summary: This paper introduces an optimal transport-based approach to federated inverse reinforcement learning (IRL). We prove that this barycentric fusion yields a more faithful global reward estimate than conventional parameter averaging methods in federated learning. Overall, this work provides a principled and communication-efficient framework for deriving a shared reward that generalizes across heterogeneous agents and environments.
Score: 5.927569454272587
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In robotics and multi-agent systems, fleets of autonomous agents often operate in subtly different environments while pursuing a common high-level objective. Directly pooling their data to learn a shared reward function is typically impractical due to differences in dynamics, privacy constraints, and limited communication bandwidth. This paper introduces an optimal transport-based approach to federated inverse reinforcement learning (IRL). Each client first performs lightweight Maximum Entropy IRL locally, adhering to its computational and privacy limitations. The resulting reward functions are then fused via a Wasserstein barycenter, which considers their underlying geometric structure. We further prove that this barycentric fusion yields a more faithful global reward estimate than conventional parameter averaging methods in federated learning. Overall, this work provides a principled and communication-efficient framework for deriving a shared reward that generalizes across heterogeneous agents and environments.
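To make the fusion step in the abstract more concrete, here is a minimal sketch of a Wasserstein barycenter in the simplest possible setting. It is not the authors' implementation: it assumes each client's learned reward has been summarized as an equal-sized set of samples on a one-dimensional state axis, and the names `barycenter_1d` and `pooled_samples` are hypothetical. In one dimension the 2-Wasserstein barycenter has a closed form (its quantile function is the weighted average of the clients' quantile functions), so no optimal transport solver is needed.

```python
from typing import List, Sequence


def barycenter_1d(samples_per_client: Sequence[Sequence[float]],
                  weights: Sequence[float]) -> List[float]:
    """2-Wasserstein barycenter of 1-D empirical distributions.

    Assumes every client supplies the same number of samples. Sorting each
    sample set and averaging rank by rank is equivalent to averaging the
    clients' quantile functions, which is the 1-D barycenter.
    """
    n = len(samples_per_client[0])
    if any(len(s) != n for s in samples_per_client):
        raise ValueError("all clients must supply the same number of samples")
    total = sum(weights)
    sorted_sets = [sorted(s) for s in samples_per_client]
    return [
        sum(w * s[k] for w, s in zip(weights, sorted_sets)) / total
        for k in range(n)
    ]


def pooled_samples(samples_per_client: Sequence[Sequence[float]]) -> List[float]:
    """Naive baseline for contrast: pool all samples (a mixture)."""
    return sorted(x for s in samples_per_client for x in s)


# Hypothetical data: client A places reward mass near state 2, client B near 8.
client_a = [1.0, 2.0, 3.0]
client_b = [7.0, 8.0, 9.0]

fused = barycenter_1d([client_a, client_b], weights=[0.5, 0.5])
pooled = pooled_samples([client_a, client_b])

print(fused)   # [4.0, 5.0, 6.0]: one mode, shifted halfway between the clients
print(pooled)  # [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]: two separate modes remain
```

The barycenter moves mass along the state axis and keeps the shape of each client's distribution, whereas pooling simply overlays the two. Note that the baseline the abstract names is parameter averaging, which this toy example does not model; for rewards over higher-dimensional spaces an iterative solver would be required instead of the closed form used here.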

Related papers

Information Gain-based Policy Optimization: A Simple and Effective Approach for Multi-Turn LLM Agents [28.145430029174577]
Large language model (LLM)-based agents are increasingly trained with reinforcement learning (RL) to enhance their ability to interact with external environments. Existing approaches typically rely on outcome-based rewards that are only provided at the final answer. In this paper, we propose Information Gain-based Policy Optimization (IGPO), a simple yet effective RL framework that provides dense and intrinsic supervision for multi-turn agent training.
arXiv Detail & Related papers (2025-10-16T17:59:32Z)
Agentic Reinforcement Learning with Implicit Step Rewards [92.26560379363492]
Large language models (LLMs) are increasingly developed as autonomous agents using reinforcement learning (agentic RL). We introduce implicit step rewards for agentic RL (iStar), a general credit-assignment strategy that integrates seamlessly with standard RL algorithms. We evaluate our method on three challenging agent benchmarks, including WebShop and VisualSokoban, as well as open-ended social interactions with unverifiable rewards in SOTOPIA.
arXiv Detail & Related papers (2025-09-23T16:15:42Z)
Client-Centric Federated Adaptive Optimization [78.30827455292827]
Federated Learning (FL) is a distributed learning paradigm where clients collaboratively train a model while keeping their own data private. We propose Federated-Centric Adaptive Optimization, which is a class of novel federated optimization approaches.
arXiv Detail & Related papers (2025-01-17T04:00:50Z)
On the Linear Speedup of Personalized Federated Reinforcement Learning with Shared Representations [15.549340968605234]
Federated reinforcement learning (FedRL) enables multiple agents to collaboratively learn a policy without sharing their local trajectories collected during agent-environment interactions. We introduce a personalized FedRL framework (PFedRL) by taking advantage of possibly shared common structure among agents in heterogeneous environments.
arXiv Detail & Related papers (2024-11-22T15:42:43Z)
Enhancing Spectrum Efficiency in 6G Satellite Networks: A GAIL-Powered Policy Learning via Asynchronous Federated Inverse Reinforcement Learning [67.95280175998792]
A novel generative adversarial imitation learning (GAIL)-powered policy learning approach is proposed for optimizing beamforming, spectrum allocation, and remote user equipment (RUE) association ins. We employ inverse RL (IRL) to automatically learn reward functions without manual tuning. We show that the proposed MA-AL method outperforms traditional RL approaches, achieving a 14.6% improvement in convergence and reward value.
arXiv Detail & Related papers (2024-09-27T13:05:02Z)
Asynchronous Message-Passing and Zeroth-Order Optimization Based Distributed Learning with a Use-Case in Resource Allocation in Communication Networks [11.182443036683225]
Distributed learning and adaptation have received significant interest and found wide-ranging applications in machine learning and signal processing. This paper specifically focuses on a scenario where agents collaborate towards a common task. Agents, acting as transmitters, collaboratively train their individual policies to maximize a global reward.
arXiv Detail & Related papers (2023-11-08T11:12:27Z)
Federated Natural Policy Gradient and Actor Critic Methods for Multi-task Reinforcement Learning [46.28771270378047]
Federated reinforcement learning (RL) enables collaborative decision making of multiple distributed agents without sharing local data trajectories. In this work, we consider a multi-task setting, in which each agent has its own private reward function corresponding to different tasks, while sharing the same transition kernel of the environment. We learn a globally optimal policy that maximizes the sum of the discounted total rewards of all the agents in a decentralized manner.
arXiv Detail & Related papers (2023-11-01T00:15:18Z)
Personalizing Federated Learning with Over-the-Air Computations [84.8089761800994]
Federated edge learning is a promising technology to deploy intelligence at the edge of wireless networks in a privacy-preserving manner. Under such a setting, multiple clients collaboratively train a global generic model under the coordination of an edge server. This paper presents a distributed training paradigm that employs analog over-the-air computation to address the communication bottleneck.
arXiv Detail & Related papers (2023-02-24T08:41:19Z)
Convergence Rates of Average-Reward Multi-agent Reinforcement Learning via Randomized Linear Programming [41.30044824711509]
We focus on the case that the global reward is a sum of local rewards, the joint policy factorizes into agents' marginals, and full state observability. We develop multi-agent extensions, whereby agents solve their local saddle point problems and then perform local weighted averaging. We establish that the sample complexity to obtain near-globally optimal solutions matches tight dependencies on the cardinality of the state and action spaces.
arXiv Detail & Related papers (2021-10-22T03:48:41Z)
Locality Matters: A Scalable Value Decomposition Approach for Cooperative Multi-Agent Reinforcement Learning [52.7873574425376]
Cooperative multi-agent reinforcement learning (MARL) faces significant scalability issues due to state and action spaces that are exponentially large in the number of agents. We propose a novel, value-based multi-agent algorithm called LOMAQ, which incorporates local rewards in the Centralized Training Decentralized Execution paradigm.
arXiv Detail & Related papers (2021-09-22T10:08:15Z)
OPIRL: Sample Efficient Off-Policy Inverse Reinforcement Learning via Distribution Matching [12.335788185691916]
Inverse Reinforcement Learning (IRL) is attractive in scenarios where reward engineering can be tedious. Prior IRL algorithms use on-policy transitions, which require intensive sampling from the current policy for stable and optimal performance. We present Off-Policy Inverse Reinforcement Learning (OPIRL), which adopts off-policy data distribution instead of on-policy.
arXiv Detail & Related papers (2021-09-09T14:32:26Z)
Exploiting Shared Representations for Personalized Federated Learning [54.65133770989836]
We propose a novel federated learning framework and algorithm for learning a shared data representation across clients and unique local heads for each client. Our algorithm harnesses the distributed computational power across clients to perform many local-updates with respect to the low-dimensional local parameters for every update of the representation. This result is of interest beyond federated learning to a broad class of problems in which we aim to learn a shared low-dimensional representation among data distributions.
arXiv Detail & Related papers (2021-02-14T05:36:25Z)
Multi-Center Federated Learning [62.57229809407692]
This paper proposes a novel multi-center aggregation mechanism for federated learning. It learns multiple global models from the non-IID user data and simultaneously derives the optimal matching between users and centers. Our experimental results on benchmark datasets show that our method outperforms several popular federated learning methods.
arXiv Detail & Related papers (2020-05-03T09:14:31Z)

This list is automatically generated from the titles and abstracts of the papers on this site.