Related papers: X-Ego: Acquiring Team-Level Tactical Situational Awareness via Cross-Egocentric Contrastive Video Representation Learning

URL: http://arxiv.org/abs/2510.19150v1
Date: Wed, 22 Oct 2025 00:48:35 GMT
Title: X-Ego: Acquiring Team-Level Tactical Situational Awareness via Cross-Egocentric Contrastive Video Representation Learning
Authors: Yunzhe Wang, Soham Hans, Volkan Ustun
Abstract summary: We introduce X-Ego-CS, a benchmark dataset consisting of 124 hours of gameplay footage from 45 professional-level matches of the popular e-sports game Counter-Strike 2. X-Ego-CS provides cross-egocentric video streams that synchronously capture all players' first-person perspectives along with state-action trajectories. We propose Cross-Ego Contrastive Learning (CECL), which aligns teammates' egocentric visual streams to foster team-level situational awareness from an individual's perspective.
Score: 1.1765015608581086
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Human team tactics emerge from each player's individual perspective and their ability to anticipate, interpret, and adapt to teammates' intentions. While advances in video understanding have improved the modeling of team interactions in sports, most existing work relies on third-person broadcast views and overlooks the synchronous, egocentric nature of multi-agent learning. We introduce X-Ego-CS, a benchmark dataset consisting of 124 hours of gameplay footage from 45 professional-level matches of the popular e-sports game Counter-Strike 2, designed to facilitate research on multi-agent decision-making in complex 3D environments. X-Ego-CS provides cross-egocentric video streams that synchronously capture all players' first-person perspectives along with state-action trajectories. Building on this resource, we propose Cross-Ego Contrastive Learning (CECL), which aligns teammates' egocentric visual streams to foster team-level tactical situational awareness from an individual's perspective. We evaluate CECL on a teammate-opponent location prediction task, demonstrating its effectiveness in enhancing an agent's ability to infer both teammate and opponent positions from a single first-person view using state-of-the-art video encoders. Together, X-Ego-CS and CECL establish a foundation for cross-egocentric multi-agent benchmarking in esports. More broadly, our work positions gameplay understanding as a testbed for multi-agent modeling and tactical learning, with implications for spatiotemporal reasoning and human-AI teaming in both virtual and real-world domains. Code and dataset are available at https://github.com/HATS-ICT/x-ego.
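The abstract does not spell out the CECL objective, so the following is only a minimal sketch of one common way to align paired views: a symmetric InfoNCE loss. It assumes that row i of each input holds the embedding of a clip from one of two teammates covering the same time window, and that every other row in the batch acts as a negative. The function name `cross_view_info_nce`, the `temperature` default, and the toy embeddings are hypothetical and are not taken from the paper or its repository.

```python
import math


def _normalize(vec):
    """Scale a vector to unit length (assumes the vector is non-zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def _cross_entropy_diag(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the row max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[i]
    return total / len(logits)


def cross_view_info_nce(view_a, view_b, temperature=0.1):
    """Symmetric InfoNCE loss between two batches of paired embeddings.

    view_a[i] and view_b[i] are treated as a positive pair (hypothetically,
    two teammates' clips from the same moment); all other pairings in the
    batch are treated as negatives.
    """
    a = [_normalize(v) for v in view_a]
    b = [_normalize(v) for v in view_b]
    # Cosine similarity between every a-row and every b-row, scaled by temperature.
    logits = [
        [sum(x * y for x, y in zip(u, w)) / temperature for w in b]
        for u in a
    ]
    transposed = [list(col) for col in zip(*logits)]
    # Average the a-to-b and b-to-a directions.
    return 0.5 * (_cross_entropy_diag(logits) + _cross_entropy_diag(transposed))


# Toy usage: correctly paired views should score a lower loss than shuffled ones.
toy = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
aligned_loss = cross_view_info_nce(toy, toy)
shuffled_loss = cross_view_info_nce(toy, toy[1:] + toy[:1])
print(aligned_loss < shuffled_loss)
```

In practice the embeddings would come from a video encoder and the loss would be computed with an autodiff framework; the pure-Python version here only illustrates the shape of the objective.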

Related papers

SoccerMaster: A Vision Foundation Model for Soccer Understanding [50.88251190999469]
Soccer understanding has recently garnered growing research interest due to its domain-specific complexity and unique challenges. This work aims to propose a unified model to handle diverse soccer visual understanding tasks, ranging from fine-grained perception to semantic reasoning. We present SoccerMaster, the first soccer-specific vision foundation model that unifies diverse understanding tasks within a single framework.
arXiv Detail & Related papers (2025-12-11T18:03:30Z)
Learning Skill-Attributes for Transferable Assessment in Video [56.813876909367856]
Skill assessment from video entails rating the quality of a person's physical performance and explaining what could be done better. Our CrossTrainer approach discovers skill-attributes, such as balance, control, and hand positioning. By abstracting out the shared behaviors indicative of human skill, the proposed video representation generalizes substantially better than an array of existing techniques.
arXiv Detail & Related papers (2025-11-17T23:53:06Z)
Player-Team Heterogeneous Interaction Graph Transformer for Soccer Outcome Prediction [8.197004730382396]
HIGFormer is a novel graph-augmented transformer-based deep learning model for soccer outcome prediction. It captures both fine-grained player dynamics and high-level team interactions. Experiments on the WyScout Open Access dataset, a large-scale real-world soccer dataset, demonstrate that HIGFormer significantly outperforms existing methods in prediction accuracy.
arXiv Detail & Related papers (2025-07-14T06:43:36Z)
Generalizable Agent Modeling for Agent Collaboration-Competition Adaptation with Multi-Retrieval and Dynamic Generation [19.74776726500979]
Adapting a single agent to a new multi-agent system brings challenges, necessitating adjustments across various tasks, environments, and interactions with unknown teammates and opponents. We propose a more comprehensive setting, Agent Collaborative-Competitive Adaptation (ACCA), which evaluates an agent's ability to generalize across diverse scenarios. In ACCA, agents adjust to task and environmental changes, collaborate with unseen teammates, and compete against unknown opponents.
arXiv Detail & Related papers (2025-06-20T03:28:18Z)
Diffusion Stochastic Learning Over Adaptive Competing Networks [28.974218453862825]
This paper studies a dynamic game between two competing teams, each consisting of a network of collaborating agents. We propose diffusion learning algorithms to address two important classes of this network game.
arXiv Detail & Related papers (2025-04-28T09:49:54Z)
From Broadcast to Minimap: Achieving State-of-the-Art SoccerNet Game State Reconstruction [0.19748373512880277]
Game State Reconstruction (GSR) involves precise tracking and localization of all individuals on the football field: players, goalkeepers, referees, and others. This capability enables coaches and analysts to derive actionable insights into player movements, team formations, and game dynamics. We present a robust end-to-end pipeline for tracking players across an entire match using a single-camera setup.
arXiv Detail & Related papers (2025-04-08T18:10:44Z)
AVA: Attentive VLM Agent for Mastering StarCraft II [56.07921367623274]
We introduce Attentive VLM Agent (AVA), a multimodal StarCraft II agent that aligns artificial agent perception with the human gameplay experience. Our agent addresses this limitation by incorporating RGB visual inputs and natural language observations that more closely simulate human cognitive processes during gameplay.
arXiv Detail & Related papers (2025-03-07T12:54:25Z)
Learning Robot Soccer from Egocentric Vision with Deep Reinforcement Learning [17.906144781244336]
We train end-to-end robot soccer policies with fully onboard computation and sensing via egocentric RGB vision. This paper constitutes a first demonstration of end-to-end training for multi-agent robot soccer.
arXiv Detail & Related papers (2024-05-03T18:41:13Z)
Collusion Detection in Team-Based Multiplayer Games [57.153233321515984]
We propose a system that detects colluding behaviors in team-based multiplayer games. The proposed method analyzes the players' social relationships paired with their in-game behavioral patterns. We then automate the detection using Isolation Forest, an unsupervised learning technique specialized in highlighting outliers.
arXiv Detail & Related papers (2022-03-10T02:37:39Z)
From Motor Control to Team Play in Simulated Humanoid Football [56.86144022071756]
We train teams of physically simulated humanoid avatars to play football in a realistic virtual environment. In a sequence of stages, players first learn to control a fully articulated body to perform realistic, human-like movements. They then acquire mid-level football skills such as dribbling and shooting. Finally, they develop awareness of others and play as a team, bridging the gap between low-level motor control at a timescale of milliseconds and coordinated goal-directed behaviour as a team at the timescale of tens of seconds.
arXiv Detail & Related papers (2021-05-25T20:17:10Z)
Coach-Player Multi-Agent Reinforcement Learning for Dynamic Team Composition [88.26752130107259]
In real-world multiagent systems, agents with different capabilities may join or leave without altering the team's overarching goals. We propose COPA, a coach-player framework to tackle this problem. We 1) adopt the attention mechanism for both the coach and the players; 2) propose a variational objective to regularize learning; and 3) design an adaptive communication method to let the coach decide when to communicate with the players.
arXiv Detail & Related papers (2021-05-18T17:27:37Z)

This list is automatically generated from the titles and abstracts of the papers on this site.