Game-invariant Features Through Contrastive and Domain-adversarial Learning
- URL: http://arxiv.org/abs/2505.17328v1
- Date: Thu, 22 May 2025 22:45:51 GMT
- Title: Game-invariant Features Through Contrastive and Domain-adversarial Learning
- Authors: Dylan Kline
- Abstract summary: Foundational game-image encoders often overfit to game-specific visual styles. We present a method that combines contrastive learning and domain-adversarial training to learn game-invariant visual features.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Foundational game-image encoders often overfit to game-specific visual styles, undermining performance on downstream tasks when applied to new games. We present a method that combines contrastive learning and domain-adversarial training to learn game-invariant visual features. By simultaneously encouraging similar content to cluster and discouraging game-specific cues via an adversarial domain classifier, our approach produces embeddings that generalize across diverse games. Experiments on the Bingsu game-image dataset (10,000 screenshots from 10 games) demonstrate that after only a few training epochs, our model's features no longer cluster by game, indicating successful invariance and potential for improved cross-game transfer (e.g., glitch detection) with minimal fine-tuning. This capability paves the way for more generalizable game vision models that require little to no retraining on new games.
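The abstract names the two ingredients but includes no code; the following PyTorch sketch shows one plausible way to combine an InfoNCE-style contrastive loss with a gradient-reversal domain classifier. The backbone, class names (`GameInvariantEncoder`, `GradReverse`), dimensions, and hyperparameters are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; flips the gradient sign on the backward pass."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

class GameInvariantEncoder(nn.Module):
    def __init__(self, feat_dim=128, num_games=10):
        super().__init__()
        # Hypothetical small CNN backbone; the paper's actual encoder may differ.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )
        # Adversarial domain (game) classifier behind the gradient-reversal layer.
        self.domain_head = nn.Sequential(
            nn.Linear(feat_dim, 64), nn.ReLU(), nn.Linear(64, num_games)
        )

    def forward(self, x, lambd=1.0):
        z = self.backbone(x)
        game_logits = self.domain_head(GradReverse.apply(z, lambd))
        return z, game_logits

def info_nce(z1, z2, temperature=0.5):
    """Contrastive loss: matching views attract, all other batch samples repel."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)

# One illustrative training step on two augmented views of each screenshot.
model = GameInvariantEncoder()
opt = torch.optim.Adam(model.parameters(), lr=3e-4)
view1, view2 = torch.randn(8, 3, 64, 64), torch.randn(8, 3, 64, 64)
game_ids = torch.randint(0, 10, (8,))  # which game each screenshot came from

z1, logits1 = model(view1)
z2, _ = model(view2)
loss = info_nce(z1, z2) + F.cross_entropy(logits1, game_ids)
opt.zero_grad(); loss.backward(); opt.step()
```

Because the gradient-reversal layer flips the sign of the domain classifier's gradient as it enters the encoder, minimizing the combined loss simultaneously clusters similar content and strips game-identifying cues from the embeddings.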
Related papers
- Gameplay Highlights Generation [3.019500891118183]
This work enables gamers to share their gaming experience on social media by automatically generating eye-catching highlight reels from their gameplay sessions.
We develop an in-house gameplay event detection dataset containing interesting events annotated by humans using the VIA video annotator.
We finetune a multimodal, general-purpose video understanding model, X-CLIP, on our dataset; it generalizes across multiple games in a genre without per-game engineering.
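As a rough, hedged illustration of the starting point: Hugging Face transformers exposes public X-CLIP checkpoints that score a video clip against text prompts. The checkpoint and event prompts below are assumptions; in the paper's setting, the annotated gameplay events would drive fine-tuning.

```python
import numpy as np
from transformers import XCLIPProcessor, XCLIPModel

# Public base checkpoint, not the paper's finetuned weights.
ckpt = "microsoft/xclip-base-patch32"
processor = XCLIPProcessor.from_pretrained(ckpt)
model = XCLIPModel.from_pretrained(ckpt)

# This checkpoint consumes 8 frames per clip; real frames would come from gameplay video.
video = np.random.randint(0, 255, (8, 224, 224, 3), dtype=np.uint8)
event_prompts = ["a player scores a kill", "routine gameplay"]  # assumed labels

inputs = processor(text=event_prompts, videos=list(video),
                   return_tensors="pt", padding=True)
outputs = model(**inputs)
probs = outputs.logits_per_video.softmax(dim=-1)  # clip-vs-prompt similarity
```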
arXiv Detail & Related papers (2025-05-12T16:28:22Z) - Towards General Game Representations: Decomposing Games Pixels into Content and Style [2.570570340104555]
Learning pixel representations of games can benefit artificial intelligence across several downstream tasks.
This paper explores how generalizable pre-trained computer vision encoders can be for such tasks.
We employ a pre-trained Vision Transformer encoder and a decomposition technique based on game genres to obtain separate content and style embeddings.
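A schematic sketch of this setup, not the paper's exact decomposition: a frozen pre-trained ViT provides a shared embedding, and two separate heads (both assumptions here) are trained to carry content and style respectively.

```python
import torch
import torch.nn as nn
from torchvision.models import vit_b_16, ViT_B_16_Weights

# Frozen pre-trained ViT backbone; the two linear heads are illustrative assumptions.
backbone = vit_b_16(weights=ViT_B_16_Weights.DEFAULT)
backbone.heads = nn.Identity()  # expose the 768-d CLS embedding
for p in backbone.parameters():
    p.requires_grad = False

content_head = nn.Linear(768, 256)  # trained so same-genre content clusters
style_head = nn.Linear(768, 256)    # trained to capture per-game visual style

x = torch.randn(4, 3, 224, 224)     # stand-in for game screenshots
with torch.no_grad():
    emb = backbone(x)
content_z, style_z = content_head(emb), style_head(emb)
```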
arXiv Detail & Related papers (2023-07-20T17:53:04Z) - On the Convergence of No-Regret Learning Dynamics in Time-Varying Games [89.96815099996132]
We characterize the convergence of optimistic gradient descent (OGD) in time-varying games.
Our framework yields sharp convergence bounds for the equilibrium gap of OGD in zero-sum games.
We also provide new insights on dynamic regret guarantees in static games.
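For intuition, here is a toy NumPy sketch of the optimistic update the paper analyzes, run on a drifting zero-sum matrix game. The drift model and the clip-and-renormalize simplex projection are simplifying assumptions, not the paper's construction.

```python
import numpy as np

def optimistic_gd(payoffs, eta=0.05):
    """Optimistic gradient descent/ascent over a sequence of zero-sum games.

    `payoffs` yields the time-varying payoff matrix A_t of the game x^T A_t y;
    the optimistic step extrapolates with the previous round's gradient.
    The clip-and-renormalize projection is a crude simplex surrogate.
    """
    payoffs = iter(payoffs)
    A = next(payoffs)
    x = np.full(A.shape[0], 1 / A.shape[0])
    y = np.full(A.shape[1], 1 / A.shape[1])
    gx_prev, gy_prev = A @ y, A.T @ x
    for A in payoffs:
        gx, gy = A @ y, A.T @ x                 # current-round gradients
        x = x - eta * (2 * gx - gx_prev)        # optimistic (extrapolated) step
        y = y + eta * (2 * gy - gy_prev)
        x = np.clip(x, 1e-9, None); x /= x.sum()
        y = np.clip(y, 1e-9, None); y /= y.sum()
        gx_prev, gy_prev = gx, gy
    return x, y

# A slowly drifting variant of matching pennies as the time-varying game.
base = np.array([[1.0, -1.0], [-1.0, 1.0]])
x, y = optimistic_gd(base + 0.05 * np.sin(t / 50.0) for t in range(2000))
```

In this matching-pennies example the equilibrium is the uniform mixed strategy, so the iterates should hover near (0.5, 0.5) despite the drift.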
arXiv Detail & Related papers (2023-01-26T17:25:45Z) - Game State Learning via Game Scene Augmentation [2.570570340104555]
We introduce a new game scene augmentation technique -- named GameCLR -- that takes advantage of the game engine to define and synthesize specific, highly controlled renderings of different game states.
Our results suggest that GameCLR can infer a game's state information from game footage more accurately than the baseline.
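The distinctive ingredient is that positives come from engine-controlled re-renderings of one game state rather than pixel-space augmentations. A purely hypothetical sketch of that pairing logic (the `render` hook and its parameters are invented stand-ins for engine calls):

```python
import random
import torch

def render(state, weather, palette, camera_jitter):
    """Hypothetical game-engine hook rendering `state` under chosen settings.

    A real implementation would call into the engine; a tensor stands in here."""
    g = torch.Generator().manual_seed(hash((state, weather, palette)) % 2**31)
    return torch.rand(3, 64, 64, generator=g) + 0.01 * camera_jitter

def positive_pair(state):
    """Two highly controlled renderings of the SAME game state; these become
    the positive pair for any standard contrastive objective."""
    draw = lambda: dict(
        weather=random.choice(["clear", "rain", "fog"]),
        palette=random.choice(["day", "night"]),
        camera_jitter=random.uniform(0.0, 1.0),
    )
    return render(state, **draw()), render(state, **draw())

view1, view2 = positive_pair(state=42)
```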
arXiv Detail & Related papers (2022-07-04T09:40:14Z) - Multi-Game Decision Transformers [49.257185338595434]
We show that a single transformer-based model can play a suite of up to 46 Atari games simultaneously at close-to-human performance.
We compare several approaches in this multi-game setting, such as online and offline RL methods and behavioral cloning.
We find that our Multi-Game Decision Transformer models offer the best scalability and performance.
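For readers unfamiliar with the architecture, a minimal return-conditioned sequence model in PyTorch is sketched below. All sizes, the token interleaving details, and the class name `TinyDecisionTransformer` are illustrative assumptions, far smaller than the paper's model.

```python
import torch
import torch.nn as nn

class TinyDecisionTransformer(nn.Module):
    """Minimal return-conditioned policy over (return, state, action) tokens."""
    def __init__(self, state_dim=64, n_actions=18, d_model=128, context=20):
        super().__init__()
        self.embed_rtg = nn.Linear(1, d_model)          # return-to-go token
        self.embed_state = nn.Linear(state_dim, d_model)
        self.embed_action = nn.Embedding(n_actions, d_model)
        self.pos = nn.Embedding(3 * context, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=2)
        self.action_head = nn.Linear(d_model, n_actions)

    def forward(self, rtg, states, actions):
        B, T, _ = states.shape
        toks = torch.stack(
            [self.embed_rtg(rtg), self.embed_state(states), self.embed_action(actions)],
            dim=2,
        ).reshape(B, 3 * T, -1)                         # interleave r, s, a per step
        toks = toks + self.pos(torch.arange(3 * T, device=toks.device))
        mask = nn.Transformer.generate_square_subsequent_mask(3 * T).to(toks.device)
        h = self.blocks(toks, mask=mask)                # causal attention
        return self.action_head(h[:, 1::3])             # predict action at state tokens

model = TinyDecisionTransformer()
rtg = torch.randn(2, 20, 1)                  # desired returns-to-go
states = torch.randn(2, 20, 64)              # encoded game observations
actions = torch.randint(0, 18, (2, 20))      # past discrete actions
action_logits = model(rtg, states, actions)  # shape (2, 20, 18)
```

Conditioning on the return-to-go token is what lets one network imitate behavior across many games at a target skill level.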
arXiv Detail & Related papers (2022-05-30T16:55:38Z) - Improving Transferability of Representations via Augmentation-Aware Self-Supervision [117.15012005163322]
AugSelf is an auxiliary self-supervised loss that learns to predict the difference of augmentation parameters between two randomly augmented samples.
Our intuition is that AugSelf encourages the preservation of augmentation-aware information in learned representations, which could be beneficial for their transferability.
AugSelf can easily be incorporated into recent state-of-the-art representation learning methods with a negligible additional training cost.
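A compact rendering of that auxiliary objective (the parameter layout and the `AugSelfHead` module are assumptions for illustration, not the authors' code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical augmentation parameters: crop box (4 values) + jitter strength (1).
AUG_DIM = 5

class AugSelfHead(nn.Module):
    """Regresses the DIFFERENCE of augmentation parameters between two views
    from their embeddings; added on top of the main self-supervised loss."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * feat_dim, 128), nn.ReLU(), nn.Linear(128, AUG_DIM)
        )

    def forward(self, z1, z2):
        return self.mlp(torch.cat([z1, z2], dim=1))

head = AugSelfHead()
z1, z2 = torch.randn(8, 128), torch.randn(8, 128)        # two views' embeddings
p1, p2 = torch.rand(8, AUG_DIM), torch.rand(8, AUG_DIM)  # recorded aug params
aux_loss = F.mse_loss(head(z1, z2), p1 - p2)  # added to the main SSL objective
```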
arXiv Detail & Related papers (2021-11-18T10:43:50Z) - Contrastive Learning of Generalized Game Representations [2.323282558557423]
Representing games through their pixels offers a promising approach for building general-purpose and versatile game models.
While games are not merely images, neural network models trained on game pixels often capture differences in the visual style of the images rather than the content of the game.
In this paper we build on recent advances in contrastive learning and showcase its benefits for representation learning in games.
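The kind of objective such work builds on is the standard SimCLR NT-Xent loss; a self-contained sketch, with batch size and dimensions as placeholder assumptions:

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, temperature=0.5):
    """NT-Xent loss over a batch of paired game-frame embeddings.

    Positives are the two views of the same frame; every other embedding in
    the 2N-sample batch acts as a negative.
    """
    z = F.normalize(torch.cat([z1, z2]), dim=1)          # (2N, D)
    sim = z @ z.t() / temperature                        # pairwise similarities
    n = z1.size(0)
    mask = torch.eye(2 * n, dtype=torch.bool)
    sim.masked_fill_(mask, float("-inf"))                # drop self-similarity
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)

loss = nt_xent(torch.randn(16, 128), torch.randn(16, 128))
```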
arXiv Detail & Related papers (2021-06-18T11:17:54Z) - Unsupervised Visual Representation Learning by Tracking Patches in Video [88.56860674483752]
We propose to use tracking as a proxy task for a computer vision system to learn visual representations.
Modelled on the Catch game played by children, we design a Catch-the-Patch (CtP) game for a 3D-CNN model to learn visual representations.
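An illustrative version of that pretext task: paste a bright patch along a trajectory through a clip, then regress its per-frame position with a 3D-CNN. The patch size, trajectory model, and regression head here are assumptions, not the paper's specification.

```python
import random
import torch
import torch.nn as nn
from torchvision.models.video import r3d_18

T, H, W, P = 8, 112, 112, 16  # frames, height, width, patch size

def make_clip():
    clip = torch.rand(3, T, H, W)
    xs = [random.randrange(W - P) for _ in range(T)]
    ys = [random.randrange(H - P) for _ in range(T)]
    for t in range(T):  # paste the patch at its position in each frame
        clip[:, t, ys[t]:ys[t] + P, xs[t]:xs[t] + P] = 1.0
    target = torch.tensor(list(zip(xs, ys)), dtype=torch.float32) / W
    return clip, target

model = r3d_18(weights=None)                       # 3D-CNN backbone
model.fc = nn.Linear(model.fc.in_features, 2 * T)  # (x, y) for each frame

clip, target = make_clip()
pred = model(clip.unsqueeze(0)).reshape(1, T, 2)
loss = nn.functional.mse_loss(pred, target.unsqueeze(0))
```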
arXiv Detail & Related papers (2021-05-06T09:46:42Z) - Generating Gameplay-Relevant Art Assets with Transfer Learning [0.8164433158925593]
We propose a Convolutional Variational Autoencoder (CVAE) system to modify and generate new game visuals based on gameplay relevance.
Our experimental results indicate that adopting a transfer learning approach can help to improve visual quality and stability over unseen data.
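A minimal conditional VAE in PyTorch, to ground the CVAE terminology; the MLP sizes and the one-hot "gameplay relevance" condition are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CVAE(nn.Module):
    """Minimal conditional VAE: generate a sprite conditioned on a gameplay tag."""
    def __init__(self, img_dim=32 * 32 * 3, cond_dim=8, z_dim=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(img_dim + cond_dim, 256), nn.ReLU())
        self.mu, self.logvar = nn.Linear(256, z_dim), nn.Linear(256, z_dim)
        self.dec = nn.Sequential(
            nn.Linear(z_dim + cond_dim, 256), nn.ReLU(),
            nn.Linear(256, img_dim), nn.Sigmoid(),
        )

    def forward(self, x, c):
        h = self.enc(torch.cat([x, c], dim=1))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        return self.dec(torch.cat([z, c], dim=1)), mu, logvar

model = CVAE()
x = torch.rand(4, 32 * 32 * 3)                        # flattened sprites
c = F.one_hot(torch.randint(0, 8, (4,)), 8).float()   # gameplay-relevance tag
recon, mu, logvar = model(x, c)
kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
loss = F.binary_cross_entropy(recon, x) + kl
```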
arXiv Detail & Related papers (2020-10-04T20:58:40Z) - Encoding Robustness to Image Style via Adversarial Feature Perturbations [72.81911076841408]
We adapt adversarial training by directly perturbing feature statistics, rather than image pixels, to produce robust models.
Our proposed method, Adversarial Batch Normalization (AdvBN), is a single network layer that generates worst-case feature perturbations during training.
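A sketch in the spirit of feature-statistics attacks like AdvBN: adversarially scale the per-channel mean and standard deviation of intermediate features by gradient ascent. The step sizes, constraint set, and the helper name `adversarial_stat_perturb` are assumptions; the paper defines the actual layer and training recipe.

```python
import torch
import torch.nn.functional as F

def adversarial_stat_perturb(feats, labels, head, eps=0.2, steps=3, lr=0.1):
    """Perturb per-channel feature mean/std to maximize the downstream loss.

    feats: (B, C, H, W) intermediate features; head: rest of the network.
    """
    mean = feats.mean(dim=(2, 3), keepdim=True)
    std = feats.std(dim=(2, 3), keepdim=True) + 1e-5
    normed = (feats - mean) / std
    # Multiplicative perturbations of the statistics, updated by gradient ascent.
    a = torch.ones_like(mean, requires_grad=True)  # scales std
    b = torch.ones_like(mean, requires_grad=True)  # scales mean
    for _ in range(steps):
        loss = F.cross_entropy(head(normed * (std * a) + mean * b), labels)
        ga, gb = torch.autograd.grad(loss, [a, b])
        with torch.no_grad():
            a += lr * ga.sign(); b += lr * gb.sign()   # ascend the loss
            a.clamp_(1 - eps, 1 + eps); b.clamp_(1 - eps, 1 + eps)
    return (normed * (std * a) + mean * b).detach()    # worst-case features

head = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(64 * 8 * 8, 10))
feats = torch.randn(4, 64, 8, 8)
adv_feats = adversarial_stat_perturb(feats, torch.randint(0, 10, (4,)), head)
```

Training the rest of the network on these worst-case feature statistics is what encodes robustness to style shifts.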
arXiv Detail & Related papers (2020-09-18T17:52:34Z) - Watching the World Go By: Representation Learning from Unlabeled Videos [78.22211989028585]
Recent single-image unsupervised representation learning techniques, which rely on artificially augmented views of each image, show remarkable success on a variety of tasks.
In this paper, we argue that videos offer such augmentation naturally and for free.
We propose Video Noise Contrastive Estimation, a method for using unlabeled video to learn strong, transferable single image representations.
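Schematically, the pairing differs from image-space contrastive learning only in where positives come from: two frames of the same video attract, frames from other videos repel. A generic sketch of that objective (the function name and shapes are assumptions, not the paper's exact loss):

```python
import torch
import torch.nn.functional as F

def temporal_info_nce(frame_emb, temperature=0.1):
    """InfoNCE with temporal positives from unlabeled video.

    frame_emb: (N_videos, 2, D) - two embedded frames sampled per video.
    """
    z1 = F.normalize(frame_emb[:, 0], dim=1)
    z2 = F.normalize(frame_emb[:, 1], dim=1)
    logits = z1 @ z2.t() / temperature       # same-video pairs on the diagonal
    targets = torch.arange(z1.size(0))
    return F.cross_entropy(logits, targets)

loss = temporal_info_nce(torch.randn(32, 2, 128))
```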
arXiv Detail & Related papers (2020-03-18T00:07:21Z)