I have rolled out a few new graphics that are now created as standard as part of my post-match batch. The one that is getting the most attention is the ‘Deserve’ to Win-O-Meter.
This is understandable. I will readily admit that deserve very often has little to do with what actually happens in sports (I wrote it right on the graphic as well). This is better thought of as a measure of where the result lies in a distribution of possible outcomes if we hold everything else constant except for the number of goals scored.
There is no way to build a full counterfactual of how everything else plays out (if you want that you might as well just stick with pre-match odds). We know goals, red cards, injuries and a list of other things change games and it is impossible to account for them. What this does instead is take the events that happened as a given and try to use them to give a sense of how often different results happen.
This is not a new idea; expected points, or win probability, has been around for a while. This takes the same principles and adds a bit more information from the game beyond just the quality of the shots to estimate the same thing and, in my opinion, presents it in a pretty format.
The format is adapted from hockey [Moneypuck.com] where this type of graphic is produced regularly as well.
How it is calculated
I am going to use Arsenal’s loss against West Ham as an example here to go through how the final numbers are produced.
For this I take xG, post-shot xG (where a shot is on frame), and non-shot xG as my basis for calculating the performances for both teams.
For xG this is exactly the same calculation that goes into classic expected points. This is already part of my standard graphics but it has a less prominent placement on the graphics.
What this does is take the xG for each shot and randomly select a value between the low and high estimate. It then generates another random number between 0 and 1: if the random number is below the selected xG value the shot is recorded as a goal, and if it is above it is recorded as not a goal.
This is done for both teams’ shots. The total number of goals for each team is recorded and compared to the other team’s to determine whether the match with these shots was a win, loss, or draw.
This is done a total of 10,000 times.
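The steps above can be sketched in a few lines of Python. This is a minimal sketch rather than the code behind the graphic: it assumes the low and high estimates are the xG value minus and plus 15% of itself, that the value in between is drawn uniformly, and that the upper end is clipped at 1.0 so it stays a valid probability. The function names and the shot values in the usage example are made up for illustration.

```python
import random

def simulate_goals(shot_xg, rng, band=0.15):
    """Simulate the goals one team scores from a list of per-shot xG values."""
    goals = 0
    for xg in shot_xg:
        # Assumption: low/high estimates are a relative band around the xG value.
        low = xg * (1 - band)
        high = min(xg * (1 + band), 1.0)
        p = rng.uniform(low, high)
        # A uniform draw on [0, 1) below the selected value counts as a goal.
        if rng.random() < p:
            goals += 1
    return goals

def simulate_match(team_xg, opponent_xg, n_sims=10_000, seed=None):
    """Return (win, draw, loss) shares for the first team over n_sims replays."""
    rng = random.Random(seed)
    wins = draws = losses = 0
    for _ in range(n_sims):
        team_goals = simulate_goals(team_xg, rng)
        opponent_goals = simulate_goals(opponent_xg, rng)
        if team_goals > opponent_goals:
            wins += 1
        elif team_goals == opponent_goals:
            draws += 1
        else:
            losses += 1
    return wins / n_sims, draws / n_sims, losses / n_sims

# Hypothetical shot lists, not the real match data.
win, draw, loss = simulate_match([0.05, 0.12, 0.30, 0.08], [0.40, 0.10], seed=1)
print(f"win {win:.1%}, draw {draw:.1%}, loss {loss:.1%}")
```

Passing a seed makes a run repeatable, which is handy when checking that a change to the inputs, and not random noise, moved the numbers.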
Next is post-shot xG. This is done in a similar way but uses the post-shot xG estimate for each chance rather than just where the shot was taken from. This accounts for a team missing the target completely, for situations where lots of shots are blocked, or for a keeper who has a heater and saves a ton of shots.
In the match we are looking at, there was a bit of all of those happening. Arsenal had a lot of shots (30) but put only 8 of them on target. The placement of those wasn’t really all that great, but it did require the keeper to have a good game to keep all of them out.
West Ham’s situation was roughly the opposite. They had just 6 shots but put 3 of them on target, and those 3 shots combined for a higher total post-shot xG value than Arsenal’s 8 attempts.
The calculation for post-shot xG is exactly the same with one change: the confidence on the estimate of the true value of the shot on target is higher. Rather than using +/- 15% it is upped to +/- 25%, capped at 99%. This is also done 10,000 times to come to an estimate of win, loss, and draw.
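To make the difference between the two ranges concrete, here is a small sketch of the sampling step on its own. It assumes the +/- percentages are relative to the estimate and that the draw is uniform; whether the 99% cap is also applied in the xG case is not stated above, so applying it to both is an assumption. The names are hypothetical.

```python
import random

XG_BAND = 0.15         # range used for regular xG
POST_SHOT_BAND = 0.25  # wider range used for post-shot xG
CAP = 0.99             # no shot is treated as a certain goal

def sample_goal_probability(estimate, band, rng, cap=CAP):
    """Draw a value between the low and high estimate, capped at `cap`."""
    low = estimate * (1 - band)
    high = min(estimate * (1 + band), cap)
    low = min(low, high)  # guard in case the whole range sits above the cap
    return rng.uniform(low, high)

rng = random.Random(42)
# A shot with post-shot xG of 0.85: the upper end (1.0625) is capped at 0.99.
p = sample_goal_probability(0.85, POST_SHOT_BAND, rng)
```

The goal/no-goal draw and the 10,000 repeats then work the same way as for regular xG.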
Lastly, I use non-shot xG. I use this because not every action ends in a shot and there are potentially dangerous situations that will go uncounted if you rely only on events that end in a shot.
Non-shot xG is at its core an estimate of the value of ball progression. Getting the ball closer to the goal that you want to score at is more valuable and will lead to scoring more often.
This takes the highest value produced in a possession sequence as the non-shot value created and uses this to go through the same process that has been used for xG and post-shot xG.
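Picking the highest value in each possession could look something like the sketch below. The input format, a list of (possession id, non-shot value) pairs, is an assumption made for illustration, as are the numbers.

```python
def possession_peaks(actions):
    """Return the highest non-shot value reached in each possession.

    `actions` is an iterable of (possession_id, value) pairs (hypothetical format).
    """
    peaks = {}
    for possession_id, value in actions:
        # Keep only the most dangerous moment of each possession.
        peaks[possession_id] = max(value, peaks.get(possession_id, 0.0))
    return list(peaks.values())

# Made-up values: two possessions, three actions each.
actions = [(1, 0.01), (1, 0.04), (1, 0.02), (2, 0.03), (2, 0.01), (2, 0.02)]
print(possession_peaks(actions))  # [0.04, 0.03]
```

Each peak can then be treated like a single chance and run through the same goal/no-goal simulation.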
All of these estimates for win, loss, and draw are then combined and then go into a final number that is presented in the ‘Deserve’ to Win-O-Meter.
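As an illustration only, one simple way to combine three sets of win, draw, and loss estimates is a weighted average, which can then be turned into expected points (3 for a win, 1 for a draw). The equal weights and the probabilities below are placeholders and should not be read as the values or the method used for the graphic.

```python
def combine_estimates(estimates, weights=None):
    """Weighted average of (win, draw, loss) tuples."""
    if weights is None:
        weights = [1.0] * len(estimates)  # assumption: every input counts equally
    total = sum(weights)
    return tuple(
        sum(w * est[i] for w, est in zip(weights, estimates)) / total
        for i in range(3)
    )

def expected_points(win, draw, loss):
    """Standard league scoring: 3 points for a win, 1 for a draw, 0 for a loss."""
    return 3 * win + 1 * draw

# Placeholder (win, draw, loss) shares for xG, post-shot xG, and non-shot xG.
combined = combine_estimates([(0.6, 0.2, 0.2), (0.3, 0.3, 0.4), (0.6, 0.1, 0.3)])
points = expected_points(*combined)
```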
For this match, the final number is lower than the raw expected points. This is because the shots on target favored West Ham and because, while the non-shot xG favored Arsenal, it came from lots of little values for getting the ball into the box but not quite into the really dangerous locations.
Final thoughts and usage
In my mind, this is still a toy rather than something super serious. It is fun to estimate things that have driven conversations for so long. Arguing over who deserved to win, or which team was better on the day is probably as old as the sport itself.
This doesn’t stop that from happening but, like the numbers it is derived from, it is an attempt to take the age-old arguments and make a more formalized estimate.
I have long said that stats don’t end a conversation (even if people will use them that way) but rather serve as a basis for further discussion and exploration. This graphic and the numbers that go into it shouldn’t be any different.
Let me know if there are any further questions here or things that could use further clarification.
I wonder if there is a way to factor in goals as part of the path dependence of the outcomes. An obvious instance in the actual game might be if someone misses a penalty (whatever xG that is, maybe 0.8) and someone taps it in as a follow up (let's say xG 0.9). Then that's a 72% chance of two goals appearing in each simulation, even though that couldn't occur in real life. Obviously this is an issue with xG too, but maybe a weighted reduction in the simulator (increase RNG range?) immediately following a simulated goal would help.
"All of these estimates for win, loss, and draw are then combined and then go into a final number that is presented in the ‘Deserve’ to Win-O-Meter."
How are these estimates combined? Is it an average of the outcome of each simulation?