Prove me wrong, Rook (Part 2)

Posted on 10/10/2010 by

Readers familiar with this blog know that I’ve been working on a model to predict success in the NBA using the Wins Produced metric (see the Basics here and the development version here). This is one of the goals of this blog, its mission statement: build the model, test it out, get feedback, improve it, test again and so forth in an iterative loop. The final product for this offseason is the full set of predictions for every team, but there are a few things pending before we get there.

Yesterday I unveiled one of those, my rookie model, and it was deemed awesome.

Again, Frank Quitely = Awesome

Today we continue with the awesomeness as we review/revise the model, look at how the 2009 draft class fared and crunch the 2010 class. And now on with the show.

I looked at all the data (thank you, Draft Express) and found that the following variables correlate in a meaningful way with rookie performance:

• Win Score per 40 minutes (Can you Play?)
• Height (Are You Tall?)
• Age when drafted (Are you Young?)
• Position (What Position?)

The initial model I came up with yesterday looked as follows:

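ADJP48 = K - A * HEIGHT + B * SIMPOS - C * DFTAGE + D * WS40

Where K, A, B, C and D are constants, HEIGHT is the player’s height, SIMPOS encodes his position, DFTAGE is his age when drafted and WS40 is his Win Score per 40 minutes.

This model had a correlation of 42% across every player coming from college who played more than 400 minutes as a rookie (from 1996 to 2010 that’s 373 players).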
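For readers who want to play along at home, here is a minimal Python sketch of that linear form and of how a correlation between predicted and actual ADJP48 could be checked. The coefficient values are placeholders invented for illustration (the fitted constants are not given in this post), the function and class names are hypothetical, and I am assuming the 42% refers to a Pearson correlation rather than an R-squared.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class Coefficients:
    # Placeholder values for illustration only; NOT the fitted constants.
    k: float = 0.5
    a: float = 0.002
    b: float = 0.01
    c: float = 0.01
    d: float = 0.02


def predict_adjp48(height, simpos, dftage, ws40, coef=Coefficients()):
    """ADJP48 = K - A*HEIGHT + B*SIMPOS - C*DFTAGE + D*WS40."""
    return (coef.k
            - coef.a * height
            + coef.b * simpos
            - coef.c * dftage
            + coef.d * ws40)


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)
```

Given a list of predictions and the matching list of actual rookie ADJP48 values, `pearson(predicted, actual)` returns the kind of number quoted above. The signs follow the formula: all else equal, an older draftee gets a lower prediction and a higher Win Score per 40 minutes gets a higher one.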

I built a second, tweaked version using categorical variables (for age and height) based on a suggestion from reader Alex. This one had a correlation of 45% across the same group of 373 players (college rookies with more than 400 minutes, 1996 to 2010). We’ll call these models Yogi and Boo Boo.
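As a rough illustration of what "categorical variables for age and height" can mean, the sketch below turns each number into a set of 0/1 bucket indicators. The bucket edges are made up for the example (the actual groupings used in Boo Boo are not listed here), and the helper names are hypothetical.

```python
from bisect import bisect_right

# Hypothetical bucket edges, for illustration only.
AGE_EDGES = [20.0, 22.0]       # under 20, 20 to under 22, 22 and over
HEIGHT_EDGES = [78.0, 82.0]    # inches: under 6'6", 6'6" to under 6'10", 6'10" and over


def bucket_index(value, edges):
    """Return which bucket the value falls in (0 is the lowest bucket)."""
    return bisect_right(edges, value)


def dummies(value, edges):
    """0/1 indicators for every bucket except the lowest, which is the baseline."""
    idx = bucket_index(value, edges)
    return [1 if idx == i else 0 for i in range(1, len(edges) + 1)]


def encode(age, height):
    """Age indicators followed by height indicators for one player."""
    return dummies(age, AGE_EDGES) + dummies(height, HEIGHT_EDGES)
```

The lowest bucket is dropped on purpose: in a regression with a constant term, keeping an indicator for every bucket would make the columns redundant. Each remaining indicator then gets its own coefficient instead of a single slope for age or height.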

This is a totally legit picture

So I went ahead and did some data analysis and added some simple logic based on the predicted WP48 for each player and the hit rate (the % of players who were above .090 WP48 over their first four years). The graph for Yogi follows:

For Yogi, I selected a predicted WP48 of .095 as the cutoff point, and for Boo Boo I went with .067 WP48. The full modified table is here. To give me an idea of the value of the model I decided to look at:

• The probability of landing a better than average player (>.090 WP48) for his first four seasons
• The probability of landing a good player (>.150 WP48) for his first four seasons

I also decided to show this for:

• Any qualifying pick (>400 MP in his rookie year)
• Any Top 5 pick
• Any Top 10 pick
• Any 1st Round pick
• Each model on its own, and both models together
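The comparison described above boils down to one calculation repeated for different groups: pick a group of players, then count what share cleared a WP48 bar over their first four seasons. Here is a minimal sketch. The .095 and .067 cutoffs and the .090 and .150 bars come from this post; the four players are invented purely to make the example run, the field names are hypothetical, and treating a prediction exactly at the cutoff as a "draft him" is my assumption.

```python
YOGI_CUTOFF = 0.095     # from the post
BOOBOO_CUTOFF = 0.067   # from the post

# Invented rows for illustration only; not real players or real predictions.
players = [
    {"pick": 1,  "yogi": 0.120, "booboo": 0.100, "wp48_4yr": 0.160},
    {"pick": 4,  "yogi": 0.080, "booboo": 0.070, "wp48_4yr": 0.095},
    {"pick": 12, "yogi": 0.100, "booboo": 0.050, "wp48_4yr": 0.060},
    {"pick": 25, "yogi": 0.050, "booboo": 0.030, "wp48_4yr": 0.020},
]


def yogi_says_draft(p):
    return p["yogi"] >= YOGI_CUTOFF


def booboo_says_draft(p):
    return p["booboo"] >= BOOBOO_CUTOFF


def both_say_draft(p):
    return yogi_says_draft(p) and booboo_says_draft(p)


def top_n(n):
    """Build a selector for players taken with one of the first n picks."""
    return lambda p: p["pick"] <= n


def hit_rate(rows, selected, bar=0.090):
    """Share of selected players whose four-year WP48 is above the bar."""
    picked = [p for p in rows if selected(p)]
    if not picked:
        return None  # nobody selected, so there is no rate to report
    hits = sum(1 for p in picked if p["wp48_4yr"] > bar)
    return hits / len(picked)
```

Calling `hit_rate(players, both_say_draft)` gives the "better than average" rate when both models agree, `hit_rate(players, top_n(5), bar=0.150)` gives the "good player" rate for Top 5 picks, and so on for each row of the comparison.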

And this was done for 1995 to 2009. The table is here:

The best performing scenario is both models calling for you to draft the player, followed by Yogi alone, then Boo Boo alone, then having a Top 5 pick. Yogi is pickier; Boo Boo casts a broader net and is more accurate. The best illustration I can give of their effectiveness is setting them loose on the 2009 rookies:

Early returns show that Model #2 (Boo Boo) did a fabulous job of picking winners and calling the ADJP48 (61% correlation).

Now all that’s left is to throw it at the 2010 rookies and combine it with the rookie minute model (Cindy) and we are done!

Tune in Tomorrow, Same Bat Time, Same Bat Channel!