Thursday, October 24, 2013

Gameweek 9 Preview

Given how early in the season we are, the model is still liable to throw up the odd outlier, so in these weekly posts I plan to address those, shall we say, unexpected results. In future weeks, the plan is to post the data as soon as possible after the final games' data is up. You can then raise questions or issues during the week, and I'll address them on the following Thursday or Friday. For this week, I'll just try to guess where the questions might lie:

Keiren Westwood
Sunderland have conceded at least two goals in six straight contests, yet the model thinks they'll do okay this week. What gives? Well, having conceded 7.3 shots inside the box (SiB) per game at home, they're hardly a team without hope (that alone would be the 9th best total of the teams playing this week). Add to that the fact that Newcastle have managed 30% fewer SiB than their opponents typically concede, while averaging only 6.0 SiB on their travels, and you get a game where we're expecting Sunderland to concede only a handful of good chances (5.3 SiB). That gives them their best shot at a clean sheet to date (36%, based on historic averages for teams surrendering those shot totals).
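To make the arithmetic a little more concrete, here is a rough sketch of how a forecast like this could be put together. It is not the model's actual code: the equal-weight blend, the Poisson step and the `goals_per_sib` rate are all assumptions made for illustration (the post itself uses historic averages for the clean sheet odds, not a Poisson formula).

```python
import math

def expected_sib_conceded(home_conceded, away_created, away_adjustment, home_weight=0.5):
    """Blend two views of the same match (hypothetical weighting)."""
    # What the home side usually concedes, scaled by how the visitors
    # fare relative to what their opponents usually allow.
    opponent_adjusted = home_conceded * (1 + away_adjustment)
    # Mix that with the visitors' own away average; the 50/50 weight is an assumption.
    return home_weight * opponent_adjusted + (1 - home_weight) * away_created

def clean_sheet_probability(expected_sib, goals_per_sib):
    """Poisson stand-in for a historic lookup: P(0 goals) = exp(-expected goals)."""
    return math.exp(-expected_sib * goals_per_sib)

# Figures from the Sunderland v Newcastle paragraph above.
sib = expected_sib_conceded(home_conceded=7.3, away_created=6.0, away_adjustment=-0.30)
print(round(sib, 2))

# goals_per_sib=0.19 is a placeholder rate, not a measured one.
print(round(clean_sheet_probability(5.3, goals_per_sib=0.19), 3))
```

With an equal blend the forecast comes out around 5.6 SiB rather than the 5.3 quoted, so the real weighting evidently differs from this sketch.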

Seamus Coleman over Leighton Baines
In reality the two are too close to separate, and the model is essentially saying they are equal. You could argue Baines is worth a little more because we know he has a steadier source of shots from set pieces, but shots from free kicks are of course built into the model already, and Coleman still comes out on top. Coleman has accounted for 6% of Everton's SiB compared to just 2% for Baines, and has hit the target more frequently too (50% vs 25%). Baines has a very slight edge (15% vs 13%) in the created chance department, but to date these players have been very close and the 2.1m premium looks tough to justify.

Mesut Ozil
As much as Ozil has impressed to date, his current forecast of close to eight points looks aggressive compared to his peers, who top out at just six. The only real explanation for this is the small sample size and some somewhat fortunate conversion rates which aren't fully regressed in the weekly forecasts. The main culprit is his 88% shots-on-target rate (SoT%), which inflates his shot expectation for the week even when adjusted for his historic average. That said, the data suggests Arsenal will top 13 SiB and 20 total shots, which is almost unprecedented. With Ozil being central to everything good about the Gunners to date, it's tough to argue against him being the top pick this week, even if the margin is probably a bit smaller than the model currently suggests.
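For anyone wondering what "regressing" a rate looks like in practice, one common approach is to shrink the observed rate toward a league-wide figure, with the pull weakening as the sample grows. The sketch below is a generic illustration rather than the model's own method: the 7-from-8 shot count, the 35% league rate and the prior weight of 20 shots are all hypothetical numbers chosen to make the effect visible.

```python
def regressed_rate(successes, attempts, prior_rate, prior_weight):
    """Shrink an observed rate toward a prior.

    prior_weight acts like a number of 'phantom' attempts taken at the
    prior rate, so small samples are pulled hardest toward the prior.
    """
    return (successes + prior_rate * prior_weight) / (attempts + prior_weight)

# Hypothetical: 7 of 8 shots on target is roughly an 88% SoT%.
raw = 7 / 8
shrunk = regressed_rate(successes=7, attempts=8, prior_rate=0.35, prior_weight=20)
print(f"raw SoT%: {raw:.0%}, regressed SoT%: {shrunk:.0%}")

# The same raw rate over a much larger sample keeps more of its value.
print(f"{regressed_rate(70, 80, 0.35, 20):.0%}")
```

Under these made-up inputs the eight-shot sample is pulled down to roughly 50%, while the same raw rate over eighty shots stays much closer to where it started.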

Where are all the Tottenham players?
Paulinho (3.7) and Soldado (3.5) are their best options yet find themselves way down the rankings in 16th and 12th places respectively. With a home fixture against Hull, most probably expect them to murder their opponents this week, yet the data suggests otherwise. First, Spurs are only averaging 6.8 SiB at home with a +/- of just 8%, both of which put them in the same league as Sunderland and Norwich rather than Arsenal and Chelsea. Second, Hull have actually been relatively good at suppressing shots away from home, and while the results haven't come, they can be forgiven for shipping goals at Chelsea, City, Newcastle and Everton. McGregor's ranking shows that the model doesn't think Hull can necessarily go to White Hart Lane and keep Spurs at bay, but a thumping is not the forecast result either, which limits the upside of Soldado and company this week (though he and the other Spurs stars remain solid starters).


Anonymous said...


I have been wondering about your sharp division of home games and away games. An alternative way to build the model would be to assume a constant difference between home and away for all teams. This would put more certainty behind the forecasts, due to the larger sample size. Of course, if there are large inter-team differences in home advantage, this doesn't hold, but I wonder whether that has been shown to be true statistically.

Chris Glover said...

Hmm, yes, that's an interesting idea. My current split is definitely a problem because, as you say, it makes sample sizes smaller, perhaps needlessly. I have data for the last two seasons so perhaps I can look at creating a "premium/discount" based on home/away games for the league as a whole.

I would assume that some teams consistently struggle more in away games than others due to their playing style, though even that could just be based on biases fed to us by the standard media coverage (e.g. Arsenal can't play at Stoke, Bolton etc).

Thanks for the idea. I'll post something if I find anything interesting or tweak the model in any way.
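As a starting point, the league-wide "premium/discount" idea could be sketched along these lines. This is only an illustration of the approach discussed above, not a tested part of the model: the function names and the four sample matches are made up, and it assumes home advantage really is the same for every team.

```python
def league_home_premium(matches):
    """Ratio of the average home SiB to the average SiB across all team-games.

    matches: list of (home_sib, away_sib) tuples, one per game.
    """
    home_mean = sum(h for h, _ in matches) / len(matches)
    away_mean = sum(a for _, a in matches) / len(matches)
    overall_mean = (home_mean + away_mean) / 2
    return home_mean / overall_mean

def venue_adjusted_sib(team_overall_sib, premium, at_home):
    """Apply the league-wide premium at home and the mirrored discount away."""
    # Home and away ratios average to 1, so the away factor is 2 - premium.
    factor = premium if at_home else 2 - premium
    return team_overall_sib * factor

# Illustrative data only: (home SiB, away SiB) for four games.
sample_matches = [(8, 6), (10, 4), (9, 5), (9, 5)]
premium = league_home_premium(sample_matches)

# A team averaging 7.0 SiB across all of its games, home and away pooled.
print(round(venue_adjusted_sib(7.0, premium, at_home=True), 2))
print(round(venue_adjusted_sib(7.0, premium, at_home=False), 2))
```

The appeal is that each team's average is estimated from all of its games, with only the single league-wide factor estimated from the home/away split.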

David Lythgoe said...

Think you'd benefit from this method:
