A Sunday Kind of Piece: Sources of Error in Predicting the Future Wins in the NBA (Part 2)

Posted on 08/29/2010 by


“You came in with the breeze
On Sunday Morning
You sure have changed since yesterday
Without any warning

I thought I knew you
I thought I knew you
I thought I knew you well… so well”
–Sunday Morning by No Doubt

It’s Sunday morning, and readers of this blog know what this could mean:

Let's get this out of the way, shall we?

No, not double rainbows. It’s time for a Sunday-paper-style feature and my attempt at an in-depth statistical piece. As always, 100% satisfaction guaranteed or your money back 🙂.

For this post, we’re going to revisit and expand on some of the issues discussed in my piece on The Talent Pool and Marginal Value in the NBA, particularly how player performance and talent level can affect predictions of future NBA team performance. This is part 2 in a series (see here for part 1). The first part focused on the effect of the league’s talent level. Here we’ll talk about the individual players themselves.

Recap: The NBA Market & the Model

As we established in the previous piece, markets are fickle, variable things. Values and prices change and move based on an almost limitless and not easily quantifiable set of variables. In sports, though, we have markets that are confined to a discrete set of parameters and results. We can measure points, wins, and losses. We can make a reasonably accurate quantitative assessment of an NBA player’s value to his team, and this is what Prof. Berri has done with his Wins Produced model.

It is with this model in hand that we can proceed to make predictions about the future performance of NBA teams. But before we do, I think it’s important that we understand the sources of variation in the model and their possible impact. To do this, let’s take a deeper look at our model.

The model works by focusing on the marginal value of an NBA player versus the set of others playing the same position. A player is always compared to his peers, and his value is measured by the competitive advantage in wins he provides to his team. A quick way to summarize the model is:

Wins Produced for a Team = Sum over all players on the team (ADJP48 * Minutes Played / 48) – Average Team Productivity for the Year + 41

Here ADJP48 is the player’s production of wins per 48 minutes, and average team productivity is the productivity, in wins, of the average team for that particular year. In the previous post on marginal value we established that:

  • The level of talent at a given position is not fixed over time but is an ever-evolving variable.
  • Average team productivity is a function of that ever-evolving talent pool of peers that a player competes against.
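The summary formula above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Prof. Berri's actual implementation: the function name, the roster, and both league-average figures are hypothetical numbers I made up to show how the same roster grades out differently when the league average moves.

```python
def team_wins_produced(players, avg_team_productivity):
    """Wins Produced for a team, following the summary formula above.

    players: iterable of (adjp48, minutes_played) pairs.
    avg_team_productivity: average team productivity (in wins) for that year.
    """
    raw_productivity = sum(adjp48 * minutes / 48 for adjp48, minutes in players)
    return raw_productivity - avg_team_productivity + 41


# Hypothetical roster of (ADJP48, minutes played); not real player data.
# Minutes add up to 19,680, i.e. 5 positions * 48 minutes * 82 games.
roster = [
    (0.45, 3000), (0.40, 2900), (0.35, 2800), (0.32, 2700),
    (0.30, 2500), (0.28, 2200), (0.25, 2000), (0.20, 1580),
]

# Same roster, two hypothetical league-average productivity levels.
strong_league = team_wins_produced(roster, avg_team_productivity=127.0)
weak_league = team_wins_produced(roster, avg_team_productivity=116.0)

print(f"Strong league: {strong_league:.1f} wins")
print(f"Weak league:   {weak_league:.1f} wins")
```

Nothing about the roster changes between the two calls. Only the league average moves, and the win total moves with it by the same amount.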

There are four main sources of variation on a year to year basis in the Wins Produced model:

  • Variation in Player Productivity or how well your players play (ADJP48 Variation)
  • Variation in the Talent Pool or the overall quality of talent
  • Minute allocation, or how good the coaching and injury luck are.
  • New players or the great unknown.

I’ve talked about the draft at length previously (and I will write more on it in the future), and I will leave minute allocation alone for now. The previous piece covered changes in the overall talent pool (see here for part 1). We saw that the impact is generally within +/- 2 wins per team by position (and might easily be discounted as noise when looking at Wins Produced). However, when we look at net impact by team by year:

We see that this becomes a significant factor. A 50-win team in 1997 becomes a 61-win team in 1999 and a 50-win team again in 2000. Talent level can cause huge shifts in overall win numbers. So when using Wins Produced to project future team performance, it is important to project the league as well, or you could wind up missing the target.

This piece will focus on variations in individual player performance.

Variations in Player Performance: The Model

Before we look at what could cause an error in our projection of player performance, let’s take a look at our model. In another previous post, I touched on a basic model of NBA player performance over time. The model is based on the idea that NBA player productivity is mostly static over time, with a predictable increase up to a peak age and then a periodic decay at the end of a career. Regression provides ample evidence that players ramp up to a certain ADJP48 (raw productivity) and tend to remain within shouting distance of that number regardless of position.

They are who we thought they were.

Here I’m going to provide a simple model with some detail. The analysis is a little more complicated than just picking out a player’s WP48 for the previous year. We start with age and ADJP48 (raw productivity).

Here’s a quick study I did based on the last ten years, using just a straight-up linear regression model (without accounting for injury or anything else):

Results for: Last 10 Years
Regression Analysis: ADJP48 versus Age, StdDev ADJP4, Max ADJP48
The regression equation is
ADJP48 = – 0.0631 + 0.00164 Age – 0.251 StdDev ADJP48 (1st 4 Y)
+ 0.850 Max ADJP48  (1st 4 Y)
S = 0.0673895   R-Sq = 69.5%   R-Sq(adj) = 69.4%

So age and prior performance explain a significant share of future player performance (an R-squared of about 69.5%). Keep in mind that a lot of this error in player performance cancels out when looking at team performance (some players play better than expected and some worse) and that I’m not accounting for injuries.
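To make the regression concrete, here is a minimal sketch that plugs numbers into the fitted equation. The coefficients and S are copied from the output above. The function names and the example player are hypothetical, and I am assuming "(1st 4 Y)" means the standard deviation and maximum of ADJP48 over a player's first four years. Treating S as a simple plus-or-minus one standard deviation band is also a simplification.

```python
# Coefficients copied from the regression output above.
INTERCEPT = -0.0631
COEF_AGE = 0.00164
COEF_STD_FIRST4 = -0.251   # StdDev ADJP48 (1st 4 Y)
COEF_MAX_FIRST4 = 0.850    # Max ADJP48 (1st 4 Y)
RESIDUAL_STD = 0.0673895   # "S" in the regression output


def predict_adjp48(age, std_first4, max_first4):
    """Point estimate of ADJP48 from the fitted linear equation."""
    return (INTERCEPT
            + COEF_AGE * age
            + COEF_STD_FIRST4 * std_first4
            + COEF_MAX_FIRST4 * max_first4)


def one_sigma_band(prediction):
    """Rough +/- 1 S band around a prediction (a simplifying assumption)."""
    return prediction - RESIDUAL_STD, prediction + RESIDUAL_STD


# Hypothetical player: 27 years old, early-career std dev 0.05, max 0.40.
estimate = predict_adjp48(age=27, std_first4=0.05, max_first4=0.40)
low, high = one_sigma_band(estimate)
print(f"Predicted ADJP48: {estimate:.3f} (roughly {low:.3f} to {high:.3f})")
```

Note how little the age term moves the estimate compared with the early-career maximum, which carries most of the weight in this fit.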

Given that age is more or less constant, let’s look at the magnitude of the error associated with player performance. For this I compared player performance, as measured by ADJP48, against the minute-weighted average. If we calculate the standard deviation by position as a rolling average over five-year periods, we see the following:

Average variability is on the order of +/- .075 Wins Produced per 48 minutes played. Center is currently the riskiest position, with Shooting Guard the least risky. Shooting, it seems, is consistent across time, while big-man skills are more volatile. To turn this into wins, I used 2,000 minutes played and 1 standard deviation, which covers about 68% of the expected player population:

So for 2006 through 2010, the error margin sits at +/- 3.3 wins for Centers at the high end and +/- 2.7 wins for Shooting Guards at the low end. What can we conclude? The answer for player variability is depth. The data argues that depth at the Center slot is more important than at the other positions (yet another argument for the short supply of tall people and WORP).
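The conversion from per-48 variability to wins is simple arithmetic, and it can be sketched as follows. The helper names are hypothetical, and the minute-weighted standard deviation is my reading of how the spread around the minute-weighted average would be computed, not a confirmed reproduction of the original spreadsheet.

```python
import math


def weighted_std(values, weights):
    """Standard deviation of values around their weighted mean.

    Here the values would be player ADJP48 figures at one position and
    the weights their minutes played (an assumed reading of the method).
    """
    total = sum(weights)
    mean = sum(v * w for v, w in zip(values, weights)) / total
    variance = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    return math.sqrt(variance)


def wins_margin(std_per48, minutes=2000):
    """Turn a per-48-minute standard deviation into wins over `minutes`."""
    return std_per48 * minutes / 48


def implied_std_per48(wins, minutes=2000):
    """Back out the per-48 standard deviation implied by a win margin."""
    return wins * 48 / minutes


# The average variability quoted above, over 2,000 minutes played.
print(f"0.075 per 48 -> +/- {wins_margin(0.075):.1f} wins")

# Per-48 figures implied by the Center and Shooting Guard margins.
print(f"Centers:         {implied_std_per48(3.3):.4f} per 48")
print(f"Shooting Guards: {implied_std_per48(2.7):.4f} per 48")
```

Run in reverse, the quoted margins imply a per-48 standard deviation of roughly .079 for Centers and .065 for Shooting Guards, which brackets the .075 average.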

Our first piece concluded that league-wide talent can be a significant factor in expected team wins. Here we see that a lack of depth at the 5 makes a team vulnerable to variation. So for an NBA team, depth is critical to meeting expectations in an 82-game regular season. As for the playoffs, as we’ve seen before, the story is a little different there.
