Once in a while, I like to take a look at incoming links for the site and go off to visit far away forums to see what they’re saying about what I write.
This can be a very rewarding and frustrating experience. Rewarding in that you get to see what the great zeitgeist of the internet spits out when you feed it an idea. Frustrating because, well, it’s the internet.
As an old grizzled veteran of the boards, I have to fight my natural instinct to fight back, feed the trolls and contribute to kitten cruelty.
But a comment I read somewhere in the great wilds of the net has inspired a response. The gist of the comment was that the Wins Produced model (see here for detail) was as accurate as just using games won last season. Rather than give the typical internet reaction:
I decided to break out Excel. I was reminded of a post that was shared with me by a certain Svengali-like presence in my life who has me in his thrall. The piece is from Advanced NFL Statistics, and in it they compare the predictions from Football Outsiders against some very simple models (just predicting 8 wins for every team, or a basic regression-to-the-mean model jokingly called Koko the Monkey). So I decided to do something similar.
Using data from 1979, my models to compare are:
- Wins from Previous Year
- Regression to the Mean: (Wins from last year - 41)/2 + 41 (or Bobo the Monkey)
- Wins Produced
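The two simple baselines in that list are easy to write down. Here is a minimal Python sketch of them; the function names are mine, and the 41-win mean assumes a full 82-game season, so treat it as an illustration rather than the exact spreadsheet behind the numbers.

```python
# Hypothetical helper names; 41 is half of an 82-game NBA season.
LEAGUE_AVERAGE_WINS = 41


def previous_year_model(last_year_wins):
    """Baseline 1: predict that a team repeats last year's win total."""
    return float(last_year_wins)


def regression_to_mean_model(last_year_wins, mean=LEAGUE_AVERAGE_WINS):
    """Baseline 2 (Bobo the Monkey): pull last year's wins halfway to the mean."""
    return (last_year_wins - mean) / 2 + mean
```

For example, a team that won 61 games last year gets a Bobo prediction of (61 - 41)/2 + 41 = 51 wins, while a 21-win team gets 31.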
If I look at the results in terms of regression it looks like this:
The difference is stark. If I look at the root mean squared error (RMSE), it looks like:
So a 94% correlation versus 41%, and an RMSE of 2 wins versus 8. Bobo the monkey indeed.
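For anyone who wants to run the same kind of comparison, here is a rough Python sketch of the two yardsticks, RMSE and correlation. The function names and the toy win totals are made up for illustration; they are not the data behind the numbers above.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between predicted and actual win totals."""
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy, invented numbers purely to show the calls (not real team records).
last_year_wins = [60, 30, 45, 20]
actual_wins = [55, 35, 50, 25]
bobo_wins = [(w - 41) / 2 + 41 for w in last_year_wins]

r_last_year = pearson_r(last_year_wins, actual_wins)
r_bobo = pearson_r(bobo_wins, actual_wins)
rmse_last_year = rmse(last_year_wins, actual_wins)
rmse_bobo = rmse(bobo_wins, actual_wins)
```

One thing the sketch makes visible: the regression-to-the-mean prediction is just a linear rescaling of last year's wins, so its correlation with actual wins is the same as the previous-year model's. Only the RMSE can tell those two apart.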
A final note: this blog is now the proud sponsor of a Basketball Reference player page. A No-Prize (name your own post) to whoever guesses the player in the comments. (I will be sans internet till tomorrow night so get cracking!)
Mark
10/07/2010
Can you at least say if it’s a current player?
arturogalletti
10/07/2010
It’s a player I’ve written about in the last 30 days (only clue before I turn off the laptop for 24 hours) 🙂
Chicago Tim
10/07/2010
Where are the sports blogs on that map?
Devin Dignam
10/07/2010
I’m trying to find it…is it up yet?
arturogalletti
10/08/2010
not yet (I picked it up yesterday)
a person
10/07/2010
Chris Paul
arturogalletti
10/08/2010
Good guess but no.
Austin
10/08/2010
No red on the first graph. Otherwise, *double thumbs up* as usual, excellent work.
Also a good source of funny images.
arturogalletti
10/08/2010
Stupid math tricks (the rsq for the two simple models came out identical).
nerdnumbers
10/08/2010
Devin,
Both Arturo’s and my sponsor pages are up (although I messed up the link on mine, it’s being fixed). See if you can find them 🙂 You can just go player by player, that’s less than 15K, easy right?
Devin Dignam
10/08/2010
So far no luck – I’ve tried the obvious Celtics and Nuggets players – and Charles Barkley for Arturo – but no hits. I’ve also tried some obvious players WP likes and hates (some “old friends” of WP).
Arturo makes it sound like his player may not be a current player, but then again, he may just be being sneaky.
ANY HINTS?!?!?!?
arturogalletti
10/08/2010
He’s an old WoW friend and he’s currently in the league. That’s as much as you’re getting.
Devin Dignam
10/09/2010
Found Andres’ page – Andre Miller. Figures…should’ve checked that one earlier.
EntityAbyss
10/08/2010
Ok, I’m lost. I read this like 3 times and still have no idea what you are showing. I read the comments, but they’re not helping. What exactly is this saying?
arturogalletti
10/08/2010
That arguing that all WP does is equivalent to quoting last year’s win total is a strawman argument.
EntityAbyss
10/08/2010
yea, I got it after a while. I just got lost for a while. The graphs confused me. If you added the wins produced for a certain team, adjusted for the age thing that dberri uses, barring injuries, how close would it come to the actual result?
arturogalletti
10/08/2010
We will be finding out in a bit. That’s what I’ve been working on.
Mark
10/08/2010
Wait, so you’re using Wins Produced as a prediction for the record of the team in the same year that the players produced those numbers? For example, the Wins Produced for a team in 2009-2010 to predict the team’s record for 2009-2010? Of course that is going to have a strong correlation, because it is equal to the correlation of the team’s efficiency differential. That’s not much of a prediction.
I suspect the commenter you are responding to meant that you should do something like predict the 2010-2011 records using the Wins Produced from 2009-2010 of the players on the 2010-2011 roster. The commenter was then asserting that this method of prediction was no better than predicting from last year’s wins. Of course I’m just guessing this is what they meant.
I’m aware that such a prediction requires a good estimate of the minutes played by different players, which is a hard problem by itself. An interesting test of Wins Produced would be to use the WP48 (or raw Adj48) for each player from one season to predict the record (or efficiency differential) of the team the following season. I’ve seen this done for individual seasons for individual teams (usually in posts with topics like “The Pistons didn’t change any of their players so why did they expect to win more games”) but I’ve never seen a comprehensive analysis comparing this prediction method to others like “previous record”.
arturogalletti
10/08/2010
Mark,
This is actually what I’ve been building towards but there are a lot of pieces. ADJP48 projection, Rookies (performance and minutes), Euros and Minute allocation overall. I’m slowly unveiling pieces. I’ll get to the full build before the season starts (and the rosters are set).
The historical build is trickier because I have to automate it so I don’t accidentally rig it.
This was really meant as a silly, funny piece while I’m working on the models and predictions for next season.
Mark
10/09/2010
Arturo, thanks for the explanation. That makes sense. I recognize all the hard work you’re doing and it is interesting reading. Keep up the good work!
Mark
10/10/2010
Nick….. Fazekas
arturogalletti
10/10/2010
Winner, winner, chicken dinner! The No-Prize is yours! Name your article.
Mark
10/10/2010
Thanks Arturo, I give credit to Google Reader because this post was literally the first thing it downloaded when I signed up, and any reason to scour basketball-reference while at work keeps my attention. I feel like a kid on Festivus morning.
I noticed while perusing the Automated Wins Produced site you linked to a few posts ago that last season Andrea Bargnani played 2799 minutes (30th in the league) while producing a whopping -3.38 wins. The Raptors would have been significantly better off giving these minutes to someone else (Amir Johnson, Reggie Evans and to a lesser extent Hedo all produced at PF, move Bosh to center fulltime and hope he doesn’t cry?), which got me thinking, how much does minute allocation affect overall wins for a team? Is there a way to rank the coaches based on how well they maximize their team’s win potential?
Thanks!
arturogalletti
10/10/2010
I did this for the playoffs (see here). I can do this for the regular season as well (I need to look at games played to do it to account for injuries). Is that officially your no-prize request?
Mark
10/10/2010
Oops must have missed these posts. I think it could be beneficial to do the regular season as it might give us a reason (other than horrible GMs) why some teams weren’t invited to the playoff party. Unless it’s too much analysis for not much reward, in which case I’d love a detailed post covering Sean Marks’ entire career 🙂
arturogalletti
10/10/2010
Mark,
I’ll take it up once I’m done with the rookie series and 76ers projection.