A short tutorial on Cox regression models
Cox regression is a proportional hazards model used to analyse ‘time to event’ data. It is so named because a change in one of the model’s inputs has a multiplicative effect on the hazard rate.
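To make the "multiplicative effect" concrete, here is a minimal sketch assuming the usual Cox form, where the hazard is a baseline hazard multiplied by the exponential of a weighted sum of the inputs. The function names and the numbers are hypothetical and chosen purely for illustration; they are not taken from any fitted model.

```python
import math

def hazard(baseline_hazard, coefficients, covariates):
    """Hazard under the assumed Cox form: h0 * exp(sum of beta_j * x_j)."""
    linear_predictor = sum(b * x for b, x in zip(coefficients, covariates))
    return baseline_hazard * math.exp(linear_predictor)

def hazard_ratio(coefficient, delta=1.0):
    """Factor by which the hazard is multiplied when one input rises by delta."""
    return math.exp(coefficient * delta)

# Illustrative (made-up) values: one covariate with coefficient 0.5.
h_before = hazard(0.1, [0.5], [1.0])
h_after = hazard(0.1, [0.5], [2.0])

# The ratio depends only on the coefficient, not on the baseline hazard.
print(h_after / h_before)   # equals exp(0.5), about 1.65
print(hazard_ratio(0.5))
```

Whatever the baseline hazard is at a given time, raising the input by one unit scales the hazard by the same factor, which is the "proportional" part of the name.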
Statistics - usually for football
Random forests have been shown to be a powerful means of classification. However, for a problem such as football match result prediction, we’re less interested in the classification itself and more so in the uncertainties involved. This post gives details on the random forest model I have been working on of late, and how Platt scaling corrects the uncalibrated probabilities calculated by the classifier. For this model I’ve used FiveThirtyEight’s football dataset, specifically for the English Premier League, combined with other match data from football-data.co.uk.
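As a rough illustration of the idea behind Platt scaling, the sketch below fits a sigmoid to a classifier's raw scores so that they behave more like probabilities. It assumes a binary outcome (say, home win or not), uses plain gradient descent on the log loss, and writes the sigmoid as 1 / (1 + exp(-(a * s + b))); the function names and the toy data are hypothetical and are not the model described in the post.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_platt(scores, labels, lr=0.1, n_iter=5000):
    """Fit slope a and intercept b by gradient descent on the mean log loss."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(n_iter):
        grad_a = 0.0
        grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y   # derivative of the log loss w.r.t. the logit
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

def calibrate(score, a, b):
    """Map a raw classifier score to a calibrated probability."""
    return sigmoid(a * score + b)

# Toy held-out data: raw scores from a classifier and the observed outcomes.
raw_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
outcomes = [0, 0, 1, 0, 1, 0, 1, 1]

a, b = fit_platt(raw_scores, outcomes)
print([round(calibrate(s, a, b), 3) for s in raw_scores])
```

In practice the sigmoid should be fitted on data the classifier was not trained on, otherwise the calibration inherits the classifier's overconfidence.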
Much has been made of Leeds’ great start to the season, but as supporters know, good runs soon came to an end under Garry Monk and in particular Thomas Christiansen last season. What can the stats tell us about this season so far, and are Leeds genuine contenders for promotion this time around?
A much-used trope on the co-comms circuit is the congratulatory words for goalkeepers who make a save having stood idle for a certain period. Watching England’s pre-World Cup friendly with Costa Rica, Glenn Hoddle could be heard trotting out this line after Jack Butland faced, and saved, his first shot.
The summer transfer window is underway, with the usual glut of clubs looking to improve on the forward options already in their squad. This includes my own Leeds, who are involved in a quite public pursuit of Abel Hernandez, formerly of Hull City. Searching through fan forums, the inevitable comparisons to Chris Wood are there to be seen, as well as the usual discussion of whether the considerable outlay on Hernandez will be worth it.
In his book, Soccermatics, Prof. David Sumpter uses a betting strategy which he calls his ‘odds-bias’ strategy. This involves comparing the implied probability of bookmakers’ odds with the observed frequency with which these events occur. In the book, biases in the odds of strong home favourites and well-matched draws are identified in the English Premier League, and a profit is made. The question is, can this be applied to other leagues, and can it be refined?
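The comparison at the heart of this can be sketched in a few lines. The example below assumes decimal odds, and removes the bookmaker's margin by simple normalisation so the three probabilities sum to one; that is one common choice rather than necessarily the method used in the book. The function names, odds and observed frequency are all made up for illustration.

```python
def implied_probabilities(home_odds, draw_odds, away_odds):
    """Convert decimal odds to implied probabilities, normalised to sum to 1."""
    raw = [1.0 / home_odds, 1.0 / draw_odds, 1.0 / away_odds]
    total = sum(raw)   # exceeds 1 when the bookmaker has built in a margin
    return [p / total for p in raw]

def odds_bias(observed_frequency, implied_probability):
    """Positive when the event happens more often than the odds imply."""
    return observed_frequency - implied_probability

# Hypothetical odds for a strong home favourite.
home, draw, away = implied_probabilities(1.25, 6.0, 11.0)

# Hypothetical observed win rate for similar favourites in past seasons.
print(odds_bias(0.80, home))
```

A consistently positive bias across many matches of the same type is the sort of signal the strategy looks for; a single match tells you nothing.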
West Ham United vs Liverpool, 5:30pm – London Stadium
This post was written (and this blog was started) with the intention of explaining a model I’ve been working on to predict future expected goals, or xG, in Premier League football matches. If this sounds a bit alien, this is an excellent primer on expected goals and the intuition behind it all.