Unbiased Self-Play
- URL: http://arxiv.org/abs/2106.03007v1
- Date: Sun, 6 Jun 2021 02:16:45 GMT
- Title: Unbiased Self-Play
- Authors: Shohei Ohsawa
- Abstract summary: We present a general optimization framework for emergent belief-state representation without any supervision.
We employ the common configuration of multiagent reinforcement learning and communication to improve exploration coverage of an environment by leveraging the knowledge of each agent.
Numerical analyses, including StarCraft exploration tasks with up to 20 agents and off-the-shelf RNNs, demonstrate state-of-the-art performance.
- Score: 2.2463154358632473
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We present a general optimization framework for emergent belief-state
representation without any supervision. We employ the common configuration of
multiagent reinforcement learning and communication to improve exploration
coverage of an environment by leveraging the knowledge of each agent. In this
paper, we show that recurrent neural nets (RNNs) with shared weights are
highly biased in partially observable environments because of their
noncooperativity. To address this, we design an unbiased version of self-play
via mechanism design, also known as reverse game theory, to elicit unbiased
knowledge at the Bayesian Nash equilibrium. The key idea is to add imaginary
rewards using the peer prediction mechanism, i.e., a mechanism for mutually
evaluating reported information in a decentralized environment. Numerical
analyses, including StarCraft exploration tasks with up to 20 agents and
off-the-shelf RNNs, demonstrate state-of-the-art performance.
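The key mechanism, adding imaginary rewards via peer prediction, can be illustrated with a minimal sketch. This is an assumption-laden toy example rather than the paper's exact construction: each agent reports a belief over a shared discrete signal space, and its imaginary reward scores that report against a signal drawn from a randomly chosen peer's report, using a strictly proper (logarithmic) scoring rule. The function names, the choice of scoring rule, and the weighting coefficient `beta` are illustrative assumptions.

```python
import numpy as np

def log_scoring_rule(report: np.ndarray, realized: int) -> float:
    """Strictly proper log scoring rule: pays the log-likelihood that the
    reported distribution assigns to the realized outcome."""
    return float(np.log(report[realized] + 1e-12))

def imaginary_rewards(messages: list[np.ndarray]) -> np.ndarray:
    """Peer-prediction style imaginary rewards (simplified sketch).

    messages[i] is agent i's reported belief, a probability vector over a
    shared discrete signal space. Each agent is scored against a randomly
    drawn peer: its report is evaluated on a signal sampled from the peer's
    report, so no ground-truth supervision is required."""
    n_agents = len(messages)
    rewards = np.zeros(n_agents)
    for i in range(n_agents):
        peers = [j for j in range(n_agents) if j != i]
        j = int(np.random.choice(peers))
        # Sample the peer's signal from its own reported distribution.
        peer_signal = int(np.random.choice(len(messages[j]), p=messages[j]))
        rewards[i] = log_scoring_rule(messages[i], peer_signal)
    return rewards

# Usage (hypothetical environment step): the imaginary reward is added to the
# environment reward before the usual RL update.
# env_rewards, msgs = env.step(actions)
# total_rewards = env_rewards + beta * imaginary_rewards(msgs)  # beta: assumed weighting coefficient
```

Because the log score is strictly proper, reporting one's true belief maximizes the expected imaginary reward against a truthful peer; this is the general sense in which peer prediction discourages biased reports at equilibrium. The paper itself should be consulted for the exact mechanism and its Bayesian Nash equilibrium analysis.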
Related papers
- Evaluating Social Bias in RAG Systems: When External Context Helps and Reasoning Hurts [7.344577590113121]
Social biases inherent in large language models (LLMs) raise significant fairness concerns.
This work focuses on evaluating and understanding the social bias implications of RAG.
arXiv Detail & Related papers (2026-02-10T06:27:56Z) - Making Bias Non-Predictive: Training Robust LLM Judges via Reinforcement Learning [91.8584139564909]
Large language models (LLMs) increasingly serve as automated judges, yet they remain susceptible to cognitive biases.
We propose Epistemic Independence Training (EIT), a reinforcement learning framework grounded in a key principle.
EIT operationalizes this through a balanced conflict strategy where bias signals are equally likely to support correct and incorrect answers.
arXiv Detail & Related papers (2026-02-02T01:43:48Z) - SRNN: Spatiotemporal Relational Neural Network for Intuitive Physics Understanding [5.9229807497571665]
This paper introduces the Spatiotemporal Relational Neural Network (SRNN), a model that establishes a unified representation for neural object attributes, relations, and timeline.
On the CLEVR benchmark, SRNN achieves competitive performance, thereby confirming its capability to represent essential language relations from the visual stream.
Our work provides a proof-of-concept that confirms the viability of translating key neural intelligence into engineered systems for intuitive physics understanding in constrained environments.
arXiv Detail & Related papers (2025-11-10T06:43:42Z) - Automated Cyber Defense with Generalizable Graph-based Reinforcement Learning Agents [7.45063623129985]
Deep reinforcement learning is emerging as a viable strategy for automated cyber defense (ACD).
In this work, we frame ACD as a two-player context-based partially observable Markov decision problem.
We show that this approach outperforms the state-of-the-art by a wide margin.
arXiv Detail & Related papers (2025-09-19T16:57:27Z) - Collaborative Value Function Estimation Under Model Mismatch: A Federated Temporal Difference Analysis [55.13545823385091]
Federated reinforcement learning (FedRL) enables collaborative learning while preserving data privacy by preventing direct data exchange between agents.
In real-world applications, each agent may experience slightly different transition dynamics, leading to inherent model mismatches.
We show that even moderate levels of information sharing can significantly mitigate environment-specific errors.
arXiv Detail & Related papers (2025-03-21T18:06:28Z) - On Multi-Agent Inverse Reinforcement Learning [8.284137254112848]
We extend the Inverse Reinforcement Learning (IRL) framework to the multi-agent setting, assuming that we observe agents following Nash Equilibrium (NE) policies.
We provide an explicit characterization of the feasible reward set and analyze how errors in estimating the transition dynamics and expert behavior impact the recovered rewards.
arXiv Detail & Related papers (2024-11-22T16:31:36Z) - Disentangling Representations through Multi-task Learning [0.0]
We provide experimental and theoretical results guaranteeing the emergence of disentangled representations in agents that optimally solve classification tasks.
We experimentally validate these predictions in RNNs trained on multi-task classification.
We find that transformers are particularly suited for disentangling representations, which might explain their unique world understanding abilities.
arXiv Detail & Related papers (2024-07-15T21:32:58Z) - Problem-Solving in Language Model Networks [44.99833362998488]
This work extends the concept of multi-agent debate to more general network topologies.
It measures the question-answering accuracy, influence, consensus, and the effects of bias on the collective.
arXiv Detail & Related papers (2024-06-18T07:59:14Z) - Self-Supervised Learning for Covariance Estimation [3.04585143845864]
We propose to globally learn a neural network that will then be applied locally at inference time.
The architecture is based on the popular attention mechanism.
It can be pre-trained as a foundation model and then be repurposed for various downstream tasks, e.g., adaptive target detection in radar or hyperspectral imagery.
arXiv Detail & Related papers (2024-03-13T16:16:20Z) - Partially Observable Stochastic Games with Neural Perception Mechanisms [31.51588071503617]
We propose the model of neuro-symbolic partially observable stochastic games (NS-POSGs).
We focus on a one-sided setting with a partially-informed agent using discrete, data-driven observations and another, fully-informed agent.
We present a new method, called one-sided NS-HSVI, for approximate solution of one-sided NS-POSGs.
arXiv Detail & Related papers (2023-10-17T20:25:40Z) - Networked Communication for Decentralised Agents in Mean-Field Games [59.01527054553122]
We introduce networked communication to the mean-field game framework.
We prove that our architecture has sample guarantees bounded between those of the centralised- and independent-learning cases.
arXiv Detail & Related papers (2023-06-05T10:45:39Z) - Self-supervised debiasing using low rank regularization [59.84695042540525]
Spurious correlations can cause strong biases in deep neural networks, impairing generalization ability.
We propose a self-supervised debiasing framework potentially compatible with unlabeled samples.
Remarkably, the proposed debiasing framework significantly improves the generalization performance of self-supervised learning baselines.
arXiv Detail & Related papers (2022-10-11T08:26:19Z) - Unsupervised Learning of Unbiased Visual Representations [10.871587311621974]
Deep neural networks are known for their inability to learn robust representations when biases exist in the dataset.
We propose a fully unsupervised debiasing framework, consisting of three steps.
We employ state-of-the-art supervised debiasing techniques to obtain an unbiased model.
arXiv Detail & Related papers (2022-04-26T10:51:50Z) - Stochastic Coherence Over Attention Trajectory For Continuous Learning In Video Streams [64.82800502603138]
This paper proposes a novel neural-network-based approach to progressively and autonomously develop pixel-wise representations in a video stream.
The proposed method is based on a human-like attention mechanism that allows the agent to learn by observing what is moving in the attended locations.
Our experiments leverage 3D virtual environments and they show that the proposed agents can learn to distinguish objects just by observing the video stream.
arXiv Detail & Related papers (2022-04-26T09:52:31Z) - On Information Asymmetry in Competitive Multi-Agent Reinforcement Learning: Convergence and Optimality [78.76529463321374]
We study a system of two interacting non-cooperative Q-learning agents.
We show that this information asymmetry can lead to a stable outcome of population learning.
arXiv Detail & Related papers (2020-10-21T11:19:53Z) - Attention or memory? Neurointerpretable agents in space and time [0.0]
We design a model incorporating a self-attention mechanism that implements task-state representations in semantic feature-space.
To evaluate the agent's selective properties, we add a large volume of task-irrelevant features to observations.
In line with neuroscience predictions, self-attention leads to increased robustness to noise compared to benchmark models.
arXiv Detail & Related papers (2020-07-09T15:04:26Z) - Learning from Failure: Training Debiased Classifier from Biased Classifier [76.52804102765931]
We show that neural networks learn to rely on spurious correlation only when it is "easier" to learn than the desired knowledge.
We propose a failure-based debiasing scheme by training a pair of neural networks simultaneously.
Our method significantly improves the training of the network against various types of biases in both synthetic and real-world datasets.
arXiv Detail & Related papers (2020-07-06T07:20:29Z) - Maximizing Information Gain in Partially Observable Environments via Prediction Reward [64.24528565312463]
This paper tackles the challenge of using belief-based rewards for a deep RL agent.
We derive the exact error between negative entropy and the expected prediction reward.
This insight provides theoretical motivation for several fields using prediction rewards.
arXiv Detail & Related papers (2020-05-11T08:13:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.