What If They Took the Shot? A Hierarchical Bayesian Framework for Counterfactual Expected Goals
- URL: http://arxiv.org/abs/2511.23072v1
- Date: Fri, 28 Nov 2025 11:01:47 GMT
- Title: What If They Took the Shot? A Hierarchical Bayesian Framework for Counterfactual Expected Goals
- Authors: Mikayil Mahmudlu, Oktay Karakuş, Hasan Arkadaş
- Abstract summary: This study develops a hierarchical Bayesian framework to quantify player-specific effects in expected goals (xG) estimation. Using 9,970 shots from StatsBomb's 2015-16 data and Football Manager 2017 ratings, we combine Bayesian logistic regression with informed priors to stabilise player-level estimates. The framework supports counterfactual "what-if" analysis by reallocating shots between players under identical contexts.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This study develops a hierarchical Bayesian framework that integrates expert domain knowledge to quantify player-specific effects in expected goals (xG) estimation, addressing a limitation of standard models that treat all players as identical finishers. Using 9,970 shots from StatsBomb's 2015-16 data and Football Manager 2017 ratings, we combine Bayesian logistic regression with informed priors to stabilise player-level estimates, especially for players with few shots. The hierarchical model reduces posterior uncertainty relative to weak priors and achieves strong external validity: hierarchical and baseline predictions correlate at R² = 0.75, while an XGBoost benchmark validated against StatsBomb xG reaches R² = 0.833. The model uncovers interpretable specialisation profiles, including one-on-one finishing (Aguero, Suarez, Belotti, Immobile, Martial), long-range shooting (Pogba), and first-touch execution (Insigne, Salah, Gameiro). It also identifies latent ability in underperforming players such as Immobile and Belotti. The framework supports counterfactual "what-if" analysis by reallocating shots between players under identical contexts. Case studies show that Sansone would generate +2.2 xG from Berardi's chances, driven largely by high-pressure situations, while Vardy-Giroud substitutions reveal strong asymmetry: replacing Vardy with Giroud results in a large decline (about -7 xG), whereas the reverse substitution has only a small effect (about -1 xG). This work provides an uncertainty-aware tool for player evaluation, recruitment, and tactical planning, and offers a general approach for domains where individual skill and contextual factors jointly shape performance.
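The counterfactual reallocation described in the abstract can be illustrated with a minimal sketch. It assumes, for illustration only, that each shot's goal probability is a logistic function of a context score plus an additive player effect, and that posterior draws of each player's effect are already available. The function names, the additive form, and all numbers below are hypothetical and are not taken from the paper.

```python
import math


def sigmoid(z):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def total_xg(context_logits, player_effect):
    """Sum of goal probabilities over a set of shots for one player effect."""
    return sum(sigmoid(c + player_effect) for c in context_logits)


def counterfactual_delta(context_logits, actual_draws, replacement_draws):
    """Per-draw change in total xG if the replacement took the same shots."""
    return [
        total_xg(context_logits, r) - total_xg(context_logits, a)
        for a, r in zip(actual_draws, replacement_draws)
    ]


# Hypothetical inputs: context log-odds for four shots, and three
# posterior draws of each player's finishing effect (log-odds scale).
shot_logits = [-2.2, -1.4, -0.3, 0.6]
shooter_draws = [-0.10, 0.00, 0.05]
replacement_draws = [0.20, 0.30, 0.25]

deltas = counterfactual_delta(shot_logits, shooter_draws, replacement_draws)
mean_delta = sum(deltas) / len(deltas)
print(f"mean change in xG: {mean_delta:+.3f}")
```

Keeping one delta per posterior draw, rather than a single point estimate, is what lets the result carry uncertainty. In a sketch like this, swapping players A and B is not symmetric in general, because each direction evaluates a different set of shot contexts.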
Related papers
- Beyond Expected Goals: A Probabilistic Framework for Shot Occurrences in Soccer [0.9940728137241215]
Expected goals (xG) models estimate the probability that a shot results in a goal from its context, but they operate only on observed shots. We propose xG+, a framework that first estimates the probability that a shot occurs within the next second and its corresponding xG if it were to occur. We show that this improves predictive accuracy at the team level and produces a more persistent player skill signal than standard xG models.
arXiv Detail & Related papers (2025-11-28T20:59:29Z)
- Analysis of Line Break prediction models for detecting defensive breakthrough in football [0.0]
In football, attacking teams attempt to break through the opponent's defensive line to create scoring opportunities. This study develops a machine learning model to predict Line Breaks using event and tracking data from the 2023 J1 League season.
arXiv Detail & Related papers (2025-10-31T06:42:20Z)
- Shoot First, Ask Questions Later? Building Rational Agents that Explore and Act Like People [81.63702981397408]
Given limited resources, to what extent do agents based on language models (LMs) act rationally? We develop methods to benchmark and enhance agentic information-seeking, drawing on insights from human behavior. For Spotter agents, our approach boosts accuracy by up to 14.7% absolute over LM-only baselines; for Captain agents, it raises expected information gain (EIG) by up to 0.227 bits (94.2% of the achievable noise ceiling).
arXiv Detail & Related papers (2025-10-23T17:57:28Z)
- Infinite Ends from Finite Samples: Open-Ended Goal Inference as Top-Down Bayesian Filtering of Bottom-Up Proposals [48.437581268398866]
We introduce a sequential Monte Carlo model of open-ended goal inference.
We validate this model in a goal inference task called Block Words.
Our experiments highlight the importance of uniting top-down and bottom-up models for explaining the speed, accuracy, and generality of human theory-of-mind.
arXiv Detail & Related papers (2024-07-23T18:04:40Z)
- Biases in Expected Goals Models Confound Finishing Ability [18.67526513350852]
Expected Goals (xG) has emerged as a popular tool for evaluating finishing skill in soccer analytics.
This paper aims to address the limitations and nuances surrounding the evaluation of finishing skill using xG statistics.
arXiv Detail & Related papers (2024-01-18T12:41:58Z)
- Bayes-xG: Player and Position Correction on Expected Goals (xG) using Bayesian Hierarchical Approach [55.2480439325792]
This study investigates the influence of player or positional factors in predicting a shot resulting in a goal, measured by the expected goals (xG) metric.
It uses publicly available data from StatsBomb to analyse 10,000 shots from the English Premier League.
The study extends its analysis to data from Spain's La Liga and Germany's Bundesliga, yielding comparable results.
arXiv Detail & Related papers (2023-11-22T21:54:02Z)
- About latent roles in forecasting players in team sports [47.066729480128856]
Team sports contain a significant social component that influences interactions between teammates and opponents.
We create RolFor, a novel end-to-end model for Role-based Forecasting.
arXiv Detail & Related papers (2023-04-17T13:33:23Z)
- A Machine Learning Approach for Player and Position Adjusted Expected Goals in Football (Soccer) [0.0]
Expected Goals (xG) allow further insight than just a scoreline.
This paper uses machine learning applications that are developed and applied to Football Event data.
The model successfully predicts xGs probability values for football players based on 15,575 shots.
arXiv Detail & Related papers (2023-01-19T22:17:38Z)
- Collusion Detection in Team-Based Multiplayer Games [57.153233321515984]
We propose a system that detects colluding behaviors in team-based multiplayer games.
The proposed method analyzes the players' social relationships paired with their in-game behavioral patterns.
We then automate the detection using Isolation Forest, an unsupervised learning technique specialized in highlighting outliers.
arXiv Detail & Related papers (2022-03-10T02:37:39Z)
- CommonsenseQA 2.0: Exposing the Limits of AI through Gamification [126.85096257968414]
We construct benchmarks that test the abilities of modern natural language understanding models.
In this work, we propose gamification as a framework for data construction.
arXiv Detail & Related papers (2022-01-14T06:49:15Z)
- Revisiting Membership Inference Under Realistic Assumptions [87.13552321332988]
We study membership inference in settings where some of the assumptions typically used in previous research are relaxed.
This setting is more realistic than the balanced prior setting typically considered by researchers.
We develop a new inference attack based on the intuition that inputs corresponding to training set members will be near a local minimum in the loss function.
arXiv Detail & Related papers (2020-05-21T20:17:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.