Thursday, November 7, 2013

Will some players always underachieve?

We had an interesting comment from a reader this week regarding the visualization that plotted actual points against expected points (xP). My proposition was that the players whose actual points trailed their xP by a distance were likely undervalued by the market, and while we wouldn't suggest they will somehow "make up" those points left on the table to date, we would expect their production to take an uptick, assuming they continue to get chances and playing time at a relatively consistent rate. The reader had a different view:

"When I look at this chart I don't see underperformers or overperformers all I see is players on form who are capitalising on their chances (Ramsey and Rooney) and players who are of such quality that they will always out perform the normal (Aguero and Yaya) . . . I believe that if you reconstructed this table after xmas with a start date of tomorrow then the same players would occupy the two sides."

It's a fair proposition and one I wanted to examine further. I think there's a general discomfort with the idea of regressing players' production to the mean as it seems to suggest they are all created equally. A couple of responses to that:
  1. For conversion rates which appear to be repeatable year on year, such as shot on target percentage (SoT%), we regress players to their own historical rates (where available). This means that if we say Olivier Giroud has an unsustainable SoT%, we're not saying his is too high compared with Danny Graham or Fraizer Campbell, we're saying it's way above his own historical rate.
  2. For conversion rates where we do regress to a league average (or at least use league average in a weighted average), it's because I haven't seen any evidence that players can consistently perform above the average in that given rate. The classic example is goals per shot on target (G/SoT), which tends to regress close to a mean for most players, with only a couple exceeding the average for more than a couple of years in a row (and that would be expected even if we were talking about a totally random event). There might be some repeatability there, but it's a lot less than most would expect based purely on notions like "form", "class" or being "clinical".
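
To make the two kinds of regression above concrete, here is a minimal sketch of a weighted average between an observed rate and a prior rate. It is not the model used on this site: the function name `regress_rate`, the `prior_weight` parameter and all the numbers are assumptions made up for illustration. The prior can be a player's own historical rate (as with SoT%) or the league average (as with G/SoT).

```python
def regress_rate(successes, attempts, prior_rate, prior_weight):
    """Blend an observed conversion rate with a prior rate.

    prior_weight is a hypothetical tuning knob: it behaves like a number
    of extra "phantom" attempts converted at prior_rate, so small samples
    are pulled strongly toward the prior and large samples barely move.
    """
    if attempts < 0 or successes < 0 or successes > attempts:
        raise ValueError("need 0 <= successes <= attempts")
    if prior_weight <= 0:
        raise ValueError("prior_weight must be positive")
    return (successes + prior_rate * prior_weight) / (attempts + prior_weight)


# Illustrative numbers only (not real player data):
# 12 shots on target from 20 shots this season, against a 40% historical SoT%.
own_history_estimate = regress_rate(12, 20, prior_rate=0.40, prior_weight=60)

# 5 goals from 10 shots on target, against an assumed 30% league G/SoT.
league_average_estimate = regress_rate(5, 10, prior_rate=0.30, prior_weight=50)
```

The larger `prior_weight` is, the longer a hot start takes to move the estimate, which is the behaviour you would want for a rate that shows little repeatability.
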
The good news is that this is fairly easy to test. Below we've plotted players' +/- score as of this week (which shows the difference between their actual and expected points with a positive score meaning their expected exceeds their actual) against the same metric from the midway point of last season. I picked that point in time based on the reader comment about Christmas but I'm fairly confident a similar conclusion could be drawn from pretty much any two comparable samples:

The first observation is that we see very little correlation from year to year. Ten players outperformed their expected points total last season by at least 10 points, yet only one of these (Podolski) has managed to outperform his expected total to date by even 5 points. Similarly, eight players underperformed their underlying stats by 10 or more points last season, and of these, only two (Lambert and Cisse) have once again failed to live up to expectations. On the flip side, we've seen players like Aguero, Rooney, Lallana, Suarez, Michu, Walters and Fellaini benefit or suffer from huge reversals in fortune over the two samples.
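
For anyone who wants to run the same check on their own numbers, the comparison above can be sketched as follows. The scores in the two lists are invented purely for illustration (they are not the data behind the chart), and the helper names `pearson` and `count_repeaters` are my own. Each position in the lists stands for the same player in both samples.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant list")
    return cov / sqrt(var_x * var_y)


def count_repeaters(first, second, first_cutoff, second_cutoff):
    """Count players at or above first_cutoff in the first sample who are
    also at or above second_cutoff in the second sample."""
    return sum(
        1 for a, b in zip(first, second)
        if a >= first_cutoff and b >= second_cutoff
    )


# Made-up +/- scores for six hypothetical players.
last_season = [12, -11, 3, 15, -8, 0]
this_season = [-2, 4, 1, 6, 9, -5]

correlation = pearson(last_season, this_season)
repeaters = count_repeaters(last_season, this_season, 10, 5)
```

A correlation near zero, together with few repeaters, is what you would expect to see if over and underperformance were mostly noise rather than a stable player trait.
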

One of the things I love about sports writing is that it can be a gateway into so many interesting subjects, and while I'm not learned enough to talk about most of them here, I would venture that there is an element of bias in the way we judge the above. When a player like Ramsey explodes in a small sample, we tend to quickly absorb that information into our collective psyche and it becomes the new self-evident truth that he is a great player (despite several seasons of reasonable yet unspectacular play, at least from a fantasy perspective). We then place too much weight on these recent events, much like how people stop swimming after a shark attack, despite the fact there are countless things more likely to really kill them that they ignore every day. I believe this specific type of bias is referred to as the availability heuristic.

In the chart we see Aguero has the second highest +/- score for 2013 and one could rationalise that being due to his superior skill and quality teammates. Indeed, that's possibly true to a point. However, he had those very same skills and most of the same teammates last year too, yet was actually one of the biggest underperformers, serving as a constant source of frustration for his owners. Or take van Persie. Last year he ascended to a new level and was casually thrown into conversations alongside the best in the world, and thus the fact he outperformed his xP by a full 13 points through half a year could be discounted as him simply being better than everyone else. Fast forward 10 months and we have a player who has only just caught up to his xP total for the year, having suffered through some bad luck these past couple of months.

As a final check, the colour coding relates to the players' team's league position ranging from 1st (green) to last (red). I wondered if we'd tend to see players from the better teams show an ability to repeat positive seasons as they benefit from more quality chances per game. I guess this works to a degree in that those in the bottom left quadrant generally play for better teams, yet there's not enough here to really draw any solid conclusions.

It's always good to challenge forecasts like the ones you find in these pages - especially the ones found in these pages! - but caution should also be exercised when dismissing data which contradicts our current view of the game. There are certainly aspects of a player's game which can consistently be above average (SoT% for one) but others seem far less repeatable. The current iteration of the model adjusts for these differences, which is why we're going to see turnover in the players who over or underachieve expectations.
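
As a rough illustration of why a few repeat overachievers don't prove much on their own, here is a small simulation sketch. It assumes every player has exactly the same true conversion rate, which is a deliberate simplification and not a claim about real players, and then counts how many still beat that rate in every season purely by chance. The function name and all parameter values are hypothetical.

```python
import random


def count_lucky_streaks(n_players, shots_per_season, true_rate, n_seasons, seed=0):
    """Count simulated players who beat true_rate in every season,
    even though all of them share the same underlying rate."""
    if shots_per_season <= 0:
        raise ValueError("shots_per_season must be positive")
    rng = random.Random(seed)  # seeded so the run is repeatable
    streaks = 0
    for _ in range(n_players):
        beat_every_season = True
        for _ in range(n_seasons):
            goals = sum(rng.random() < true_rate for _ in range(shots_per_season))
            if goals / shots_per_season <= true_rate:
                beat_every_season = False
                break
        if beat_every_season:
            streaks += 1
    return streaks


# Assumed inputs: 100 identical players, 30 shots on target per season,
# a 30% goals-per-shot-on-target rate, two seasons in a row.
lucky = count_lucky_streaks(100, 30, 0.30, 2, seed=1)
```

Whatever number comes out, every one of those "consistent overachievers" is identical in skill to the rest by construction, so any real-world claim of repeatable skill needs to show more repeaters than chance alone would produce.
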

4 comments:

CDI said...

Thanks a lot for your posts Chris. I'm one of the few who has fallen behind as I refused to believe Yaya and Ram could sustain their scoring levels due to my own views on their games. To date they have proven me wrong and I'm paying for it, but the stats show that my views were not totally off base and variance is just working against me right now.

The main issue I'm having now is whether I should jump on the train while it's still running. Poker has taught me the long run can be RELLLLLLY long, which can allow players like Ram to run hot for a whole season or even longer in terms of SOT/G ratios. If you didn't have Ram in your team would you buy him today at 7.2?

Tony S said...

Hi Chris,
Nice post, I had a feeling that the Michu 'last season vs this season' would pop up at some point in this debate.
I believe that, although players' stats may level out over two years, during those two years they will go on hot streaks and slumps. And I believe that following the overachievers during these hot streaks will pay dividends and chasing the points from underachievers will never pay off.
Everything Michu hit last year went into the top corner and everything he hits this year goes to the goalkeeper. So just avoid him ..... equally last year Suarez and Aguero seemed ‘unlucky’ at times but so far this year they seem to be getting the breaks .......... so back them all the way.
Of course the question is ‘when is the hot streak over?’ and ‘when is a slump finished?’. This is the tricky bit because I don’t think it can be detected (quickly) with statistics, I think this is where watching football matches is imperative. Stats can tell you a player is running hot but I don’t think they can tell you it is over until well after the fact. If Ramsey has 4 shots against Utd and 3 of them hit the post ..... then it's over.

I suppose my point is that underachieving and overachieving is temporary, it is decided by a player's form and confidence, and when you see a hot streak jump on it ...... and only leave it for another hot streak but NEVER select an underachiever until he has proved that his slump is over.

Of course this is just my opinion, but that's what Fantasy Football is all about.

Cheers
Tony

2ndMan said...

This is a brilliant post Chris, exactly the sort of discussion which makes this blog brilliant.

It's encouraging to see no relationship here, and if there is it seems to be a negative one which would support the idea that players will regress.

I think bias is the main reason there's still huge objection to analytics in football, it's far easier to rationalise players being in good/bad form/confidence than it is to say players perform fairly consistently but experience random variation in their results. The hot hand fallacy in basketball was dispelled decades ago but coaches and commentators still talk about giving hot players the ball, and players still want to take heat checks.

What I'd love to see built into these models to test whether some can consistently over-perform is player wages as a proxy for player talent. But good luck getting hold of reliable data on that!

Gummi said...

Great post. This gives us current Fantasy underachievers confidence to keep playing the numbers game instead of chasing points.

@CDI: I'm in the same position as you and I'm not going to chase Ramsey and Toure. That would be making a logical choice based on variation, as I see it. If it works out, it would be luck.