Related papers: A Survey on Self-play Methods in Reinforcement Learning

A Survey on Self-play Methods in Reinforcement Learning

URL: http://arxiv.org/abs/2408.01072v1
Date: Fri, 2 Aug 2024 07:47:51 GMT
Title: A Survey on Self-play Methods in Reinforcement Learning
Authors: Ruize Zhang, Zelai Xu, Chengdong Ma, Chao Yu, Wei-Wei Tu, Shiyu Huang, Deheng Ye, Wenbo Ding, Yaodong Yang, Yu Wang,
Abstract summary: Self-play, characterized by agents' interactions with copies or past versions of itself, has recently gained prominence in reinforcement learning. This paper first clarifies the preliminaries of self-play, including the multi-agent reinforcement learning framework and basic game theory concepts. It provides a unified framework and classifies existing self-play algorithms within this framework.
Score: 30.17222344626277
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Self-play, characterized by agents' interactions with copies or past versions of itself, has recently gained prominence in reinforcement learning. This paper first clarifies the preliminaries of self-play, including the multi-agent reinforcement learning framework and basic game theory concepts. Then it provides a unified framework and classifies existing self-play algorithms within this framework. Moreover, the paper bridges the gap between the algorithms and their practical implications by illustrating the role of self-play in different scenarios. Finally, the survey highlights open challenges and future research directions in self-play. This paper is an essential guide map for understanding the multifaceted landscape of self-play in RL.

Related papers

Playpen: An Environment for Exploring Learning Through Conversational Interaction [81.67330926729015]
We investigate whether Dialogue Games can also serve as a source of feedback signals for learning.<n>We introduce Playpen, an environment for off- and online learning through Dialogue Game self-play.<n>We find that imitation learning through SFT improves performance on unseen instances, but negatively impacts other skills.
arXiv Detail & Related papers (2025-04-11T14:49:33Z)
How do Large Language Models Understand Relevance? A Mechanistic Interpretability Perspective [64.00022624183781]
Large language models (LLMs) can assess relevance and support information retrieval (IR) tasks. We investigate how different LLM modules contribute to relevance judgment through the lens of mechanistic interpretability.
arXiv Detail & Related papers (2025-04-10T16:14:55Z)
Decorrelation-based Self-Supervised Visual Representation Learning for Writer Identification [10.55096104577668]
We explore the decorrelation-based paradigm of self-supervised learning and apply the same to learning disentangled stroke features for writer identification. We show that the proposed framework outperforms the contemporary self-supervised learning framework on the writer identification benchmark. To the best of our knowledge, this work is the first of its kind to apply self-supervised learning for learning representations for writer verification tasks.
arXiv Detail & Related papers (2024-10-02T11:43:58Z)
A Probabilistic Model Behind Self-Supervised Learning [53.64989127914936]
In self-supervised learning (SSL), representations are learned via an auxiliary task without annotated labels. We present a generative latent variable model for self-supervised learning. We show that several families of discriminative SSL, including contrastive methods, induce a comparable distribution over representations.
arXiv Detail & Related papers (2024-02-02T13:31:17Z)
Semi-supervised learning made simple with self-supervised clustering [65.98152950607707]
Self-supervised learning models have been shown to learn rich visual representations without requiring human annotations. We propose a conceptually simple yet empirically powerful approach to turn clustering-based self-supervised methods into semi-supervised learners.
arXiv Detail & Related papers (2023-06-13T01:09:18Z)
DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning [140.96990096377127]
We introduce self-distillation and online clustering for self-supervised speech representation learning (DinoSR) DinoSR first extracts contextualized embeddings from the input audio with a teacher network, then runs an online clustering system on the embeddings to yield a machine-discovered phone inventory, and finally uses the discretized tokens to guide a student network. We show that DinoSR surpasses previous state-of-the-art performance in several downstream tasks, and provide a detailed analysis of the model and the learned discrete units.
arXiv Detail & Related papers (2023-05-17T07:23:46Z)
Self-Supervised Multi-Object Tracking For Autonomous Driving From Consistency Across Timescales [53.55369862746357]
Self-supervised multi-object trackers have tremendous potential as they enable learning from raw domain-specific data. However, their re-identification accuracy still falls short compared to their supervised counterparts. We propose a training objective that enables self-supervised learning of re-identification features from multiple sequential frames.
arXiv Detail & Related papers (2023-04-25T20:47:29Z)
SS-VAERR: Self-Supervised Apparent Emotional Reaction Recognition from Video [61.21388780334379]
This work focuses on the apparent emotional reaction recognition from the video-only input, conducted in a self-supervised fashion. The network is first pre-trained on different self-supervised pretext tasks and later fine-tuned on the downstream target task.
arXiv Detail & Related papers (2022-10-20T15:21:51Z)
Leveraging Explanations in Interactive Machine Learning: An Overview [10.284830265068793]
Explanations have gained an increasing level of interest in the AI and Machine Learning (ML) communities. This paper presents an overview of research where explanations are combined with interactive capabilities.
arXiv Detail & Related papers (2022-07-29T07:46:11Z)
Self-Regulated Learning for Egocentric Video Activity Anticipation [147.9783215348252]
Self-Regulated Learning (SRL) aims to regulate the intermediate representation consecutively to produce representation that emphasizes the novel information in the frame of the current time-stamp. SRL sharply outperforms existing state-of-the-art in most cases on two egocentric video datasets and two third-person video datasets.
arXiv Detail & Related papers (2021-11-23T03:29:18Z)
A Survey on Contrastive Self-supervised Learning [0.0]
Self-supervised learning has gained popularity because of its ability to avoid the cost of annotating large-scale datasets. Contrastive learning has recently become a dominant component in self-supervised learning methods for computer vision, natural language processing (NLP), and other domains. This paper provides an extensive review of self-supervised methods that follow the contrastive approach.
arXiv Detail & Related papers (2020-10-31T21:05:04Z)
Deep Reinforcement Learning with Stacked Hierarchical Attention for Text-based Games [64.11746320061965]
We study reinforcement learning for text-based games, which are interactive simulations in the context of natural language. We aim to conduct explicit reasoning with knowledge graphs for decision making, so that the actions of an agent are generated and supported by an interpretable inference procedure. We extensively evaluate our method on a number of man-made benchmark games, and the experimental results demonstrate that our method performs better than existing text-based agents.
arXiv Detail & Related papers (2020-10-22T12:40:22Z)
Self-supervised Learning from a Multi-view Perspective [121.63655399591681]
We show that self-supervised representations can extract task-relevant information and discard task-irrelevant information. Our theoretical framework paves the way to a larger space of self-supervised learning objective design.
arXiv Detail & Related papers (2020-06-10T00:21:35Z)
A Comparison of Self-Play Algorithms Under a Generalized Framework [4.339542790745868]
The notion of self-play, albeit often cited in multiagent Reinforcement Learning, has never been grounded in a formal model. We present a formalized framework, with clearly defined assumptions, which encapsulates the meaning of self-play. We measure how well a subset of the captured self-play methods approximate this solution when paired with the famous PPO algorithm.
arXiv Detail & Related papers (2020-06-08T11:02:37Z)
Modeling Document Interactions for Learning to Rank with Regularized Self-Attention [22.140197412459393]
We explore modeling documents interactions with self-attention based neural networks. We propose simple yet effective regularization terms designed to model interactions between documents. We show that training self-attention network with our proposed regularization terms can significantly outperform existing learning to rank methods.
arXiv Detail & Related papers (2020-05-08T09:53:31Z)

This list is automatically generated from the titles and abstracts of the papers in this site.