This is not a typical Expecting Goals post. For one, it has gone out in full to all readers of Expecting Goals. Hi everyone! And most notably, it is not in itself a larger study of complex topics in football analytics. This will be a short work of commentary about teams in the 2024-25 Premier League based on a toy model I built.
As I’ll explain below, this will be a regular addition to your Expecting Goals content. When I have some stuff to say about soccer that is not a fully worked-out study, I will post it here. As the season goes on, we can track the power rankings in the Premier League, and perhaps other top European leagues, and see how they hold up and what they have taught us.
Power Rankings
These power rankings are based on a combination of expected goals and actual goals, considering only 11v11 minutes in the Premier League, and adjusted for schedule difficulty. They come from ongoing, unfinished work on team valuation, and I present them as a toy, not a fully worked-out model of true team quality.1
There are six columns.
axGD/Mt: My 24-25 estimated expected goals difference for the team in 11v11 situations, adjusted for strength of schedule, with a small adjustment for actual non-penalty goals scored and conceded. The weight on actual goals will increase over the season.
Att Rating: Adjusted attacking performance in 2024-25 above or below league, scaled to league average = 1 (so 143% is 43% more expected goals production than league average).
Def Rating: Adjusted defensive performance in 2024-25 above or below league, scaled to league average = 1 (so 67% means conceding about two-thirds as much expected goals production as an average team). Lower is better here.
SoS xGD: The expected goals difference per match that an average team would be projected to have against this team's schedule, based on my team ratings. So +0.11 means an easier schedule: an average team would be projected to have a +0.11 xGD against it.
Proj G/Mt: These are my underlying team attacking ratings for the projections, based on 23-24 and 24-25 data, a projection of goals per match. Because they incorporate 23-24 data (with a weight that decreases over the season), these are much less responsive to performance in individual matches. Note how Chelsea rank well above Manchester City in performance this season, but still have notably worse Projected G/Mt and GA/Mt going forward.
Proj GA/Mt: These are my underlying team defensive ratings for the projections, based on 23-24 and 24-25 data, a projection of goals conceded per match.
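As a quick guide to reading the Att and Def columns, the scaling to league average = 1 works like this. This is a minimal sketch; the function name and the example league-average figure are my own illustration, not part of the model:

```python
def rating_vs_league(team_xg_per_match: float, league_avg_xg_per_match: float) -> float:
    """Scale a per-match xG figure so that league average = 1.0.

    An attacking rating of 1.43 means 43% more xG production than the
    average team; a defensive rating of 0.67 means conceding about
    two-thirds of the league-average xG.
    """
    return team_xg_per_match / league_avg_xg_per_match

# Example: a team producing 2.0 xG/match in a league averaging 1.4 xG/match
print(round(rating_vs_league(2.0, 1.4), 2))  # ~1.43, i.e. 43% above average
```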
Commentary
When you look at just 11v11 minutes, Arsenal stand out as the best team in the league. The gap to Liverpool is not so wide that it is impossible Liverpool could overtake them in time, but on the matches we have seen so far, Arsenal have done the better job creating chances for themselves and preventing chances for the opposition. They have, obviously, received several red cards this season and, less obviously, failed to take points from close matches where last season the bounces tended to go their way. Liverpool, by contrast, have been much more efficient at translating modestly superior play into points, and this is the source of their large lead in the table.
At some point I hope to build a projection system that I like enough to share here. Based on the toy projections I have at the moment, Liverpool would be solid favorites for the title despite landing second in these power rankings.
For Manchester City, the result against Brighton seems to have finally focused the media on a clear finding in the underlying numbers. Not since Pep Guardiola's first season in charge have Manchester City failed to rank 1 or 1a in any advanced statistical power ranking of the Premier League. But with Rodri and Kevin De Bruyne out and Julian Alvarez gone and inexplicably not replaced, City have been playing merely solid top-four football. We are all waiting for Pep to conjure up some new tactics and turn it around, but he has not done it so far.
Chelsea have clearly been the best team from outside the expected top three, and their performance is even more impressive because they have put up these numbers against one of the toughest schedules in the Premier League so far.
If the Premier League gets only four places in the Champions League, we have increasing reason to think the chase will not be terribly exciting. But the big pile-up of good-not-great teams outside the top four suggests that if Premier League clubs do well enough in Europe this season to earn a fifth qualifying place, anything is possible.
Manchester United. No one who follows soccer through an analytics lens could possibly have been surprised by their poor start. I think one notable fact here is that their problems appear to be concentrated on the attacking side. The signings of Matthijs De Ligt and Noussair Mazraoui stabilized the back line, but there just are not enough shots in that attack. I am skeptical that a new manager will be able to find a lot of new attacking firepower in a collection of players who rarely attempt three or more shots in a match.
Nottingham Forest look like a classic good version of a Nuno Espírito Santo team, similar to his Wolves sides that rattled off a run of top ten finishes. The attack has been juiced by some hot finishing, but this is a team that succeeds because they are exceptionally well-organized out of possession and can spring a few counter-attacks. It is difficult to collect enough points to finish in the top four or five with such a profile, because highly-defensive teams tend to achieve more draws and fewer wins.
Southampton have been one of the most tactically distinct teams in the data era, not just in the Premier League but across the big five leagues. I do not mean this as praise. But I am considering writing up a profile of this team because it is still underrated how strange it is that the worst team in the league plays such an aggressively possession-based style.
Methodology and To-Dos
Here’s how the method currently works. It strips out all minutes not played 11v11. It takes non-penalty expected goals (based on my xG model) and actual non-penalty goals in those minutes. The weight of xG to NPG begins at 90/10 and progresses to 50/50 over the full season, because there is real signal in goal-scoring, but it is swamped by variance in small samples. This aggregate creates an initial score for each team.
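The weighting step above can be sketched as follows. I am assuming a linear progression from 90/10 to 50/50, which the post does not specify, and all names here are my own illustration rather than the actual model code:

```python
def npg_weight(matches_played: int, season_length: int = 38) -> float:
    """Weight on actual non-penalty goals: 10% before any matches have
    been played, rising (linearly, as an assumption) to 50% by the end
    of the season."""
    progress = min(matches_played / season_length, 1.0)
    return 0.10 + 0.40 * progress

def initial_score(xgd_per_match: float, npgd_per_match: float, matches_played: int) -> float:
    """Blend 11v11 non-penalty xG difference with actual NPG difference."""
    w = npg_weight(matches_played)
    return (1 - w) * xgd_per_match + w * npgd_per_match

print(npg_weight(0))   # 0.1 -- the 90/10 split at the season's start
print(npg_weight(38))  # 0.5 -- the 50/50 split at the season's end
```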
To adjust for schedule, I use past-season data as well as current-season data. The weight on current-season data relative to past-season data increases every match, and after the 25th match the past-season data drops out of the team quality estimate entirely.
Then I take the aggregate of opposition strength, weighted to 11v11 minutes, and adjusted for home field advantage. This creates the schedule strength metric. Team performance is multiplied by the schedule strength inflator (or deflator) to create axGD/Mt.
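The two adjustment steps can be sketched like this. The linear decay and the exact multiplicative form are my assumptions; the post only says that the past-season weight reaches zero after match 25 and that the schedule adjustment is a multiplier, so the baseline scale below is an arbitrary illustrative choice:

```python
def team_quality(current_rating: float, past_rating: float,
                 matches_played: int, cutoff: int = 25) -> float:
    """Blend current- and past-season ratings; the past-season weight
    falls (linearly, as an assumption) to zero by the cutoff match."""
    w_past = max(1.0 - matches_played / cutoff, 0.0)
    return w_past * past_rating + (1.0 - w_past) * current_rating

def schedule_adjusted(raw_xgd_per_match: float, sos_xgd: float,
                      baseline_scale: float = 1.35) -> float:
    """One plausible multiplicative form: performance against an easy
    schedule (positive SoS xGD) is deflated, and against a hard one
    inflated. The baseline scale (roughly league-average goals per team
    per match) is my own choice for illustration."""
    inflator = baseline_scale / (baseline_scale + sos_xgd)
    return raw_xgd_per_match * inflator
```

With a neutral schedule (SoS xGD of 0) the adjustment is a no-op; a team that posted +0.5 xGD/match against a schedule an average side would beat by +0.11 gets marked down slightly.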
There is a lot here that could be improved; this is just a toy model. Here are my own criticisms of it:
The lack of other potentially-useful data. Proper power rankings would include performance outside the Premier League, and perhaps payroll and player quality factors.
The use of simple aggregates without regard for the game-state contexts of these numbers. To what degree is performance in game states where the outcome of a match is less in doubt indicative of a team’s quality? To what degree do changes in play style when teams are leading or trailing artificially inflate or deflate production?
Throwing out minutes at uneven strength. Would it be better to include those minutes but with a deflator based on the change in expected performance? Here also a game-state adjustment would probably have interaction effects.
The lack of time-series adjustments in the weighting. Do performances in more recent matches better project future performance? Does performance in the second half of the previous season better project future performance than how a team played in the first half?
Penalties and other very high-xG chances. One of the reasons that goals is a less useful statistic than xG in small samples is simply that goals are all-or-nothing. Penalties are too. What about extremely high xG chances? A team creating a 0.90 xG chance has probably not played twice as well in attack as a team that created a 0.45 xG chance. Would these numbers be more predictive if very high xG opportunities were regressed somewhat?
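One way to experiment with that last question, with an entirely hypothetical threshold and shrink rate of my own choosing:

```python
def regressed_shot_xg(shot_xg: float, threshold: float = 0.5, shrink: float = 0.5) -> float:
    """Leave chances below the threshold untouched; shrink the excess
    above it, so a 0.90 xG chance counts for less than twice a 0.45 one."""
    if shot_xg <= threshold:
        return shot_xg
    return threshold + shrink * (shot_xg - threshold)

print(regressed_shot_xg(0.45))  # 0.45, unchanged
print(regressed_shot_xg(0.90))  # 0.70, rather than the raw 0.90
```

Whether any such transform actually improves predictions is exactly the open question; the parameters here would need to be fit, not assumed.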
The exact rate of decay in projections. How much should past performance be weighted against current performance over time, and how much should goals and xG be weighted against each other?
Finally, if I could build power rankings that are better than just a “toy”, it should be relatively easy to build a projection system and talk about how likely different teams are to win the league, get relegated, qualify for the Champions League, and so on.
Any of these could be a proper Expecting Goals post on its own, and it’s work I am following up on.
Why Post This?
I am thinking about how to use different platforms to talk about stuff. All questions of politics aside, Twitter has become an utterly dead space for sharing ideas because it is impossible to direct people off-platform to any longer-format writing. I built my following through Twitter and never got into Instagram or YouTube or other platforms, and I find myself a bit stuck.
I am going to try out using Expecting Goals and the Substack platform for sharing my thoughts about football beyond the specific work of Expecting Goals studies. I look forward to seeing how you all feel about it and what direction this work can take.
Let me know in the comments if you have thoughts about the kind of posts we can add to Expecting Goals.
1 As I will discuss below, I hope to improve them and get some Expecting Goals content out of that process.