Saturday, December 22, 2012

Model Review: A Comparison

After I posted a review of the latest model, reader Agnar had the excellent suggestion of not just comparing the model to actual results, but also benchmarking it against a simpler forecast system, one likely employed by large sections of fantasy managers. As suggested, a simple and useful method is to take the points per game (actually points per 90 minutes, or P90, to avoid the odd sub appearance skewing things) accumulated in gameweeks 1-7 and compare that rate to the actual rate delivered in gameweeks 8-17. We can then plot that side by side with the model analysis and see (a) which is better, and (b) whether there are any specific areas in which the model succeeds or fails.

The first graph shows the P90 for gameweeks 1-7 plotted against the P90 for gameweeks 8-17 (filtered to only include players who racked up 600+ minutes). The second graph shows the forecast points for gameweeks 8-17 from the model plotted against the actual P90 for gameweeks 8-17 (again, limited to players with 600+ minutes).
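For anyone wanting to reproduce the historic benchmark, here is a minimal sketch of the calculation rather than the exact code behind the graphs. The input layout (tuples of player, gameweek, minutes, points) and the function names are my own assumptions, and since the filter above doesn't say which window the 600+ minutes refers to, the sketch assumes it applies to gameweeks 8-17.

```python
from collections import defaultdict


def p90(points, minutes):
    """Points per 90 minutes; None if the player logged no minutes."""
    return 90.0 * points / minutes if minutes > 0 else None


def split_p90(appearances, split_gw=7, last_gw=17, min_minutes=600):
    """Return {player: (early_p90, late_p90)}.

    appearances: iterable of (player, gameweek, minutes, points) tuples
    (a hypothetical layout, not the blog's actual data format).
    Assumption: the minutes filter is applied to the later window only.
    """
    # For each player keep [points, minutes] totals for both windows.
    totals = defaultdict(lambda: {"early": [0, 0], "late": [0, 0]})
    for player, gw, minutes, points in appearances:
        if gw < 1 or gw > last_gw:
            continue
        window = "early" if gw <= split_gw else "late"
        totals[player][window][0] += points
        totals[player][window][1] += minutes

    result = {}
    for player, t in totals.items():
        early_pts, early_min = t["early"]
        late_pts, late_min = t["late"]
        # Skip players with no early rate or too few later minutes.
        if early_min == 0 or late_min < min_minutes:
            continue
        result[player] = (p90(early_pts, early_min), p90(late_pts, late_min))
    return result
```

Each value in the result is a pair of rates: the first is the "forecast" a points chaser would use, the second is what the player actually delivered.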

I'm pleased to say that the model looks quite a lot stronger than simply looking at historic points to date, and while that's hardly proof that the model is somehow a perfect forecasting tool (it very much isn't), it's good to know we're improving on the status quo method used by many a manager across the world (i.e. points chasing). A couple of specifics on the two charts:

                          Model    P90 Historic
Correlation               66%      53%
r-squared                 0.50     0.30
Standard deviation        0.7      1.1
Players within one P90    67%      52%
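As a rough illustration of how summary figures like these can be computed for any forecast, here is a small sketch. The definitions are assumptions on my part: "standard deviation" is taken as the sample standard deviation of the forecast errors, "within one P90" as an absolute error of 1.0 or less, and r-squared as the squared Pearson correlation, which may not be how the figures in the table were derived. The function names are hypothetical.

```python
import math
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def forecast_summary(forecast, actual, tolerance=1.0):
    """Summarise how well forecast P90 values track actual P90 values."""
    errors = [f - a for f, a in zip(forecast, actual)]
    r = pearson(forecast, actual)
    return {
        "correlation": r,
        # Assumption: r-squared taken as the squared correlation.
        "r_squared": r * r,
        # Assumption: spread of the forecast errors (sample std dev).
        "error_std": statistics.stdev(errors),
        # Share of players whose forecast missed by at most `tolerance`.
        "share_within_tolerance": sum(abs(e) <= tolerance for e in errors)
        / len(errors),
    }
```

Running the same function on the model forecasts and on the gameweek 1-7 rates gives a like-for-like comparison of the two approaches.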

The big outliers from the model forecast are those who are generally considered 'elite' or at least 'in form' (whatever that means). While this is a problem in that they're the players we obviously need to target for our respective fantasy teams, it's encouraging that a certain type of player is being mis-evaluated by the model, as this gives us an opportunity to tweak things and forecast them better (in some cases it probably isn't mis-evaluation and simply a case of unforecastable luck). The historic forecast, on the other hand, sees all kinds of players with bad forecasts, which can be explained by initially high conversion rates, easier fixtures or a change in playing time/role.

I'd hope that anyone looking at this data regularly was confident it was providing them with something stronger than simply looking at the points table, but at least this analysis tries to quantify that advantage. With 66% correlation and only a 0.5 r-squared, we're far from announcing the model as a total success just yet, but given that the above is based on small samples there's reason to believe these rates will improve as we get more data (comparing the model on a pure total points basis rather than a per-90-minutes rate also improves things, with a correlation of 72% for all players excluding games below 45 minutes).

4 comments:

Anthony said...

Thanks so much Chris for writing such a great blog! Always entertaining to catch up on your latest statistical endeavors!

Got a question I'd like to ask you. How do you rate Pienaar? He's been getting some returns lately and I'm thinking of buying him for the Wigan game as a short term buy. Has he impressed you going by stats at all? It might be a good time to own him with Fellaini out of the equation temporarily.

Oh and commiserations on Aguero (c), I had the same and was hoping for a stronger performance from City. Very disappointing to say the least!

Oliver Alexander said...

Hey Chris, firstly just want to say a big thanks for creating this blog! I think you've done great with the model so far and I think it's going to continue to improve as the season progresses.

There are a couple of things I would like to hear your opinion on as this part of the season is so vital and I'm a bit behind because of starting in gameweek 3. I've been doing fairly well recently and I'm now mid table in my mini league but I think I need some more differentials. Like you I went for Aguero (c) and was gutted! I think Aguero is going to come good sometime soon but I think my 11 mil could be better spent. What do you think of the idea of doing Aguero>Benteke to allow Demel>Luiz and Sess> elite midfielder? Villa's fixtures are good and I think Benteke is a real talent. Luiz is expensive but his move to midfield, if it is permanent, makes him really tempting imo. Chelsea are also looking good at the back. Bearing in mind that I need to make ground, do you think these transfers are a good idea? I would plan to do the first 2 before GW19 (I don't think I can field 11 otherwise) and the midfield transfer the week after.

Thanks in advance!

Tony S said...

Hi Chris,
Nice Blog, it's good to see that the model is improving but I was just wondering if removing the uncertainty of the bonus points would improve the model? Because as it stands a goal in a Stoke game is worth 3 extra points whereas a goal in a Reading game isn't necessarily, and it's hard to capture this in any model.
So if the calculation model was designed to give point projections excluding bonus points then our correlation to the actual P90 might improve.
Predicting the allocation of bonus points can be addressed another time.

Cheers
Tony