Friday, December 7, 2012

Dimitar Berbatov and small sample sizes

There's been some chatter these past couple of weeks about Dimitar Berbatov's perceived decline, with Bryan Ruiz's absence being suggested by a number of sources as the main factor. Fantasy Football Scout members can read their well-reasoned and logical take on the issue here. This post is not a rebuttal of that or any other post - indeed Ruiz's absence might be a legitimate reason for Berbatov's apparent decline - I just want to highlight the danger of cherry-picking stats and using small sample sizes to make data fit a convenient narrative.

Using just the last three games without Ruiz seems a touch imprecise, as we have a number of other gameweeks from this very season where Berbatov played without his Costa Rican pal (GW3, 4 and 8, plus two games where Ruiz only came on as a sub - GW5 and 9). Splitting Berbatov's data simply between his minutes with Ruiz and those without paints a less dramatic picture than the disaster some have suggested, but we still observe a not insignificant difference in Berbatov's underlying stats:

So while Berbatov has gotten as many touches without Ruiz in the side, he hasn't been able to generate as many shots, with almost a full shot inside the box and 0.6 shots on target fewer without the Costa Rican. Those totals aren't insignificant when you consider that around one in three of those SoT are converted into goals, which makes the 0.6 SoT gap alone worth roughly 0.2 goals.

The question I have, though, is one of causality. Correlation does not necessarily imply causation, and thus we need to consider whether Ruiz's absence was really the leading factor in Berbatov's 'struggles' these past three weeks. One of the metrics I use to determine team strength is how teams performed against their opponents to date compared to others in the league. So if Arsenal are averaging nine shots inside the box and Everton hold them to six, we'd say Everton overachieved by 33%, since six is a third fewer than nine. Averaging those +/- factors over the season gives us an overall rating by which to judge teams. If we split Fulham's opponents to date between those who are above or below average, we get the below results for Berbatov:

The non-statistically swayed reader will now say "thanks for telling us that Berbatov is better against weaker teams", yet this simpler narrative seems like at least part of the explanation for the Bulgarian's apparent decline. He's faced five 'hard' opponents this season, who have all performed between 11% and 26% better than league average, with three of those coming in the last three gameweeks. Stoke (-21%), Chelsea (-11%) and Tottenham (-26%) have all held opponents to at least 11% fewer shots than league average, and thus Berbatov's chances have dried up somewhat. The next six gameweeks see Fulham face five sides who have been average or worse this year in terms of allowing shots, with only the trip to Anfield in GW18 looking like a particular concern.
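For readers who want to try the opponent adjustment themselves, here is a minimal Python sketch of the idea described above. It is not the exact calculation behind the numbers in this post: the formula (shots allowed minus the opponent's usual average, divided by that average), the function names and the sample games are all assumptions made for illustration. The sign convention follows the percentages quoted above, where a negative figure such as Stoke's -21% means a defence allows fewer shots than par. Under this assumed convention, holding a side that averages nine shots to six comes out at about -33%; dividing by the shots allowed instead would give a different figure.

```python
from statistics import mean


def shot_factor(shots_allowed: float, opponent_average: float) -> float:
    """Relative gap between the shots a defence allowed and what that opponent
    usually manages. Negative means the defence did better than par.
    (Assumed formula, not necessarily the one used for the post's figures.)"""
    return (shots_allowed - opponent_average) / opponent_average


def defensive_rating(games: list[tuple[float, float]]) -> float:
    """Average the per-game factors over a season.
    Each game is a (shots_allowed, opponent_average) pair."""
    return mean(shot_factor(allowed, avg) for allowed, avg in games)


def split_opponents(ratings: dict[str, float]) -> tuple[list[str], list[str]]:
    """Split teams into 'hard' (rating below zero) and 'easy' (zero or above)."""
    hard = sorted(team for team, r in ratings.items() if r < 0)
    easy = sorted(team for team, r in ratings.items() if r >= 0)
    return hard, easy


# Hypothetical games for one defence: (shots allowed, opponent's usual average).
sample_games = [(6, 9), (5, 5), (8, 10)]
rating = defensive_rating(sample_games)
print(f"Season rating: {rating:+.0%}")

# The three ratings quoted in the post, plus one made-up team for contrast.
ratings = {"Stoke": -0.21, "Chelsea": -0.11, "Tottenham": -0.26, "Example FC": 0.15}
hard, easy = split_opponents(ratings)
print("Hard:", hard)
print("Easy:", easy)
```

Once opponents are split this way, a player's shots can be averaged separately over the 'hard' and 'easy' games, which is the comparison made for Berbatov above. With only a handful of games in each bucket the averages will be noisy, so the split is best read as a sanity check rather than proof.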

Could Berbatov be missing Ruiz? Possibly. Even probably. But I'd be cautious about taking a three-game, non-opponent-adjusted sample and trying to fit it into a single narrative to explain the variance in his underlying stats. If you own him, I would suggest it's hasty to sell now given the upcoming fixtures, and I personally want to see how he fares against lesser opponents in Newcastle and QPR before pronouncing his decline terminal.


JT said...

You make a great point about "cherry picking stats and using small sample sizes to make data fit with a convenient narrative."

It's easy to take some data and assume that the correlation implies causality and I thought the exact same thing when reading the FFS piece on Berbatov.

Good work Chris!

Mark Sutherns said...

To be fair our article is called "Monitor" rather than "Conclusive Proof - Sell Him Now" ;-)

The idea of the article, like all the monitor articles, is to spot and anticipate trends using smaller samples. We can wait 10 weeks to identify that Berbatov has declined but, in that time, his owners have drawn multiple blanks - we need to know now if there is a possible reason behind the decline. The reader can see that a small sample is used and decide for themselves how much stock to put in the patterns discussed.

There is a clear decline and there appears to be a shift in role for Berbatov without Ruiz, and that does appear to hinder his output. As we say in the article, let's see if that pattern continues on Monday night - by doing so, we were advocating more data too, whilst alerting the reader to the viewpoint that there could be a reason behind the recent slump.

Chris Glover said...

Mark - You're right, and I hope this doesn't come across as a criticism of that piece. We actually draw the same conclusion; it's more the desperate reader comments, with the standard "must sell" calls, that I was addressing. I do the exact same thing myself (looking at small samples) because, like you say, that's what is needed or else you'll be too late. Just thought it was interesting to see that while Ruiz's absence might be a factor, it also coincided with some really tough games. I wrote this late last night, so again, apologies if the tone comes across as preachy or critical.

stooshermadness said...

Chris - this is a really good post and I don't think you need to apologize. Like a lot of us holding Berbatov, I was waiting to see if his tough stretch of games against good defensive sides might be the key factor in his recent string of 2-pointers. Your analysis confirms that MAY be the case - and it MAY well be Ruiz's absence as the FFS article speculated.

Or it could be something as simple as "luck" if we want to call it that. There has always also been a "streaky" side to Berbatov (I had him in my FF team for long periods when he was with Tottenham, but his goal scoring at MUN had a similar "non-pattern" if you will), for which you can hardly provide a statistical analysis as to when he might emerge from a poor string in FF terms. This is a guy, we should all remember, who has scored 5 goals in a single game twice in the last 4 years or so. It's not like his pattern is a fairly consistent, metronomic 1 goal every 3 games, although over his career that appears about right. At the end of the day, the goals/assists for Berbatov may just come down to when he is "feeling" it, the way a pro golfer suddenly starts canning every 15-footer he looks at. Those of us holding Berbatov probably need to trust that there will be a monster game coming at some point.