A short tutorial on Cox regression models

Cox regression is a proportional hazards model for ‘time to event’ data. It is so named because a change in one of the model’s inputs has a multiplicative effect on the hazard rate.
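
For reference, the standard textbook form of the model makes the multiplicative effect explicit. The notation below is the conventional one rather than anything taken from the post itself: h_0(t) is the baseline hazard, x the vector of inputs, β the coefficients, and e_j a unit vector that raises input j by one.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(\beta^{\top} x\right),
\qquad
\frac{h(t \mid x + e_j)}{h(t \mid x)} = \exp(\beta_j)
```

So a one-unit increase in input j multiplies the hazard by exp(β_j) at every time t, which is the ‘proportional’ part of the name.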

Predicting football matches with Random Forests and Platt scaling

Random forests have been shown to be a powerful means of classification. However, for a problem such as football match result prediction, we’re less interested in the classification itself and more so in the uncertainties involved. The following post gives details on the random forest model I have been working on of late, and how Platt scaling corrects the uncalibrated probabilities calculated by the classifier. For this model I’ve used fivethirtyeight’s football dataset, specifically for the English Premier League. I’ve combined this with other match data found on football-data.co.uk.
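
As a rough illustration of the idea, Platt scaling fits a sigmoid that maps a classifier’s raw score to a calibrated probability. The sketch below is not the code from the post: it assumes a binary outcome (say, home win or not), the names `fit_platt` and `platt_predict` are hypothetical, and plain gradient descent on the log loss stands in for the optimiser used in Platt’s original method.

```python
import math


def fit_platt(scores, labels, lr=0.1, n_iter=5000):
    """Fit p = 1 / (1 + exp(-(a * s + b))) to (score, label) pairs.

    scores: raw classifier outputs on a held-out calibration set (assumption).
    labels: 0/1 outcomes for the same matches.
    Uses simple batch gradient descent on the mean log loss.
    """
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(n_iter):
        grad_a = 0.0
        grad_b = 0.0
        for s, y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(-(a * s + b)))
            # Gradient of the log loss with respect to a and b.
            grad_a += (p - y) * s
            grad_b += p - y
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def platt_predict(score, a, b):
    """Map a raw score to a calibrated probability with the fitted sigmoid."""
    return 1.0 / (1.0 + math.exp(-(a * score + b)))
```

In practice the sigmoid would be fitted on matches the forest was not trained on, so that the calibration is not flattered by overfitting.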

How much have Leeds United improved?

Much has been made of Leeds’ great start to the season, but as supporters know, good runs soon came to an end under Garry Monk and in particular Thomas Christiansen last season. What can the stats tell us about this season so far, and are Leeds genuine contenders for promotion this time around?

Do idle keepers lose concentration?

A much-used trope on the co-comms circuit is the congratulatory words for goalkeepers who make a save having stood idle for a certain period. Watching England’s pre-World Cup friendly with Costa Rica, Glenn Hoddle could be heard trotting out this line after Jack Butland faced, and saved, his first shot.

What makes a good goal scorer?

The summer transfer window is underway, with the usual glut of clubs looking to improve on the forward options already in their squad. This includes my own Leeds, who are involved in a quite public pursuit of Abel Hernandez, formerly of Hull City. Searching through fan forums, the inevitable comparisons to Chris Wood are there to be seen, as well as the usual discussion of whether the considerable outlay on Hernandez will be worth it.

Soccermatics odds-bias strategy in Irish league football

In his book, Soccermatics, Prof. David Sumpter uses a betting strategy which he calls his ‘odds-bias’ strategy. This involves comparing the probability implied by bookmakers’ odds with the observed frequency with which these events occur. In the book, biases in the odds of strong home favourites and well-matched draws are identified in the English Premier League, and a profit is made. The question is, can this be applied to other leagues, and can it be refined?
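
To make the comparison concrete, here is a minimal sketch of how implied probabilities and a bias figure might be computed. It is an illustration only, not the method from the book or the post: the function names are hypothetical, decimal odds are assumed, and dividing out the bookmaker’s margin proportionally is just one common normalisation.

```python
def implied_probabilities(decimal_odds):
    """Convert the decimal odds for every outcome of one match
    (e.g. home, draw, away) into probabilities that sum to 1.

    The raw values 1 / odds sum to more than 1 because of the
    bookmaker's margin; here that margin is divided out proportionally.
    """
    raw = [1.0 / odds for odds in decimal_odds]
    total = sum(raw)
    return [r / total for r in raw]


def odds_bias(implied, outcomes):
    """Observed frequency minus mean implied probability for a group of bets.

    implied:  implied probability of each bet in the group.
    outcomes: 1 if the bet won, 0 otherwise.
    A positive value means the events happened more often than the
    odds suggested.
    """
    observed = sum(outcomes) / len(outcomes)
    expected = sum(implied) / len(implied)
    return observed - expected
```

Grouping bets by type, such as strong home favourites, and computing the bias for each group over a few seasons gives the kind of comparison described above.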

Can we predict future xG using the gamma distribution?

This post was written (and this blog was started) with the intention of explaining a model I’ve been working on to predict future expected goals, or xG, in Premier League football matches. If this sounds a bit alien, this is an excellent primer on expected goals, and the intuition behind it all.
