Age & Productivity Model for the NBA Revisited

Posted on 08/12/2010 by


Earlier today I put up a post on the four-and-a-half-team trade that just took place. Reader Tom Mandel asks:

I agree with your analysis — hard not to. Also, as usual, your framing is witty and entertaining. I love that is still available!

Yet, can you tell me — methodologically — why it’s preferable to use Ariza’s ’09-10 WP48 rather than his much better ’08-9 WP48 in analyzing his value in the trade?

Obviously, it’s a more recent number — but is that fact a defensible methodological justification? After all, the previous two years provide more minutes, i.e. a better dataset.

Am I wrong to say that the value of your analysis is virtually *entirely* in the data you use? And that the rest is not much more than arithmetic?

Hence, how much confidence do you have that the data you use is the right data to assess this trade? And what justifies that confidence?

I was going to respond to this in the comments, but once I realized I’d be putting in regression analysis and graphs, I decided that we would all be better served by a post of its own.

The analysis is a little more complicated than just picking out his last year’s WP48. I’m going to start with age and ADJP48 (raw productivity). I published a post on the age model before, which covers these topics in detail, and we will revisit some of them here.

There’s a lot of evidence (i.e., regression work) that players ramp up to a certain ADJP48 (raw productivity) and tend to remain within shouting distance of that number regardless of position. Here’s a quick study I did based on the last ten years using a straight-up linear regression model:
Results for: Last 10 Years
Regression Analysis: ADJP48 versus Age, StdDev ADJP48, Max ADJP48
The regression equation is
ADJP48 = -0.0631 + 0.00164 Age - 0.251 StdDev ADJP48 (1st 4 Y) + 0.850 Max ADJP48 (1st 4 Y)
S = 0.0673895   R-Sq = 69.5%   R-Sq(adj) = 69.4%
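
To make the equation concrete, here is a minimal sketch that simply plugs numbers into the fitted coefficients above. The function name, the argument names, and the example player are hypothetical; only the coefficients come from the regression output.

```python
def predict_adjp48(age, std_first4, max_first4):
    """Apply the fitted regression equation quoted above.

    age        -- player's age in years
    std_first4 -- standard deviation of ADJP48 over the first 4 years
    max_first4 -- maximum ADJP48 over the first 4 years
    """
    return -0.0631 + 0.00164 * age - 0.251 * std_first4 + 0.850 * max_first4


# Hypothetical player (made-up numbers, not real data):
# 25 years old, first-four-year max ADJP48 of 0.400, std dev of 0.050.
estimate = predict_adjp48(age=25, std_first4=0.050, max_first4=0.400)
print(f"Estimated ADJP48: {estimate:.3f}")
```

Note how small the age coefficient is next to the coefficient on the first-four-year max: in this fit, what a player did early in his career carries most of the weight.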

So age and prior performance explain a significant share (about 70%, per the R-Sq above) of future player performance. Now the age curve looks something like this (looking at players born since 1970):

And graphically like this:

You see a gradual increase up to about age 24, after which players remain at a consistent level until they hit a gradual decline starting around 30.

What does it all mean?

You Knew I couldn't resist

If I want to build a simple model of future player performance I work under the following assumptions:

  • A player’s ADJP48 will remain consistent over time, with noise on a year-to-year basis
  • Age is important toward the beginning and end of a career
  • Minute allocation remains consistent over time

So with that, I take each player’s average minutes played over the last three years (to filter out year-to-year noise), the weighted average of his ADJP48, and the position adjustment for the last year, and project a win model. The model is good enough for a quick take but not as good as what I would do for a season projection. The weaknesses here are:

  • Old or young players (Collison and Posey come to mind)
  • Minute allocation by coaches
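
The quick projection described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the exact spreadsheet behind the trade post: the names are hypothetical, the ADJP48 weights are assumed to be minutes played (the post does not say which weights are used), and the step from ADJP48 to WP48 is assumed to be the usual Wins Produced form, ADJP48 minus the position adjustment plus a league-average constant of 0.099.

```python
from dataclasses import dataclass


@dataclass
class Season:
    minutes: float  # minutes played that season
    adjp48: float   # raw productivity per 48 minutes that season


def project_wins(seasons, position_adjustment, league_avg_wp48=0.099):
    """Project next-season wins from up to the last three seasons.

    seasons             -- list of Season, oldest first
    position_adjustment -- last year's position adjustment (assumed to be
                           subtracted from ADJP48)
    league_avg_wp48     -- assumed average-player WP48 constant
    """
    recent = seasons[-3:]
    total_minutes = sum(s.minutes for s in recent)
    if total_minutes == 0:
        return 0.0

    # Average minutes over the window filters out year-to-year noise.
    avg_minutes = total_minutes / len(recent)

    # Assumption: weight each season's ADJP48 by minutes played.
    weighted_adjp48 = sum(s.minutes * s.adjp48 for s in recent) / total_minutes

    # Assumption: standard conversion from ADJP48 to WP48.
    wp48 = weighted_adjp48 - position_adjustment + league_avg_wp48

    return wp48 * avg_minutes / 48


# Hypothetical player with made-up numbers, purely for illustration.
history = [Season(2400, 0.30), Season(2600, 0.34), Season(2800, 0.28)]
print(f"Projected wins: {project_wins(history, position_adjustment=0.30):.1f}")
```

Both weaknesses listed above show up directly in the sketch: nothing in it adjusts for age, and the minutes are simply assumed to carry over from the last three years.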

If I were a GM, I’d have a stat department and I would build a more robust model for this kind of transaction. I also would not be doing quips on the internets (at least not under my real name).
