You can lay the blame for inspiring this post squarely on the shoulders of Nate Silver, Dave Berri and Andres Alvarez.
My muse can be very fickle. Some posts I write in minutes, some take hours or days. Some never get written. It’s all about feeling inspired and having something to say. This particular post has been in my head for months. I’ve written drafts. I’ve done hours and hours of exhaustive research on this. I’ve built models, I’ve bought software to confirm and verify my findings. I’ve run my findings past other people to confirm I am not seeing things.
My point is this: I did the work so now I feel fully justified in breaking out my pimp hand. You may not like it but as always you can go off and confirm what I find. Please remember that I am merely an agent for science.
Let’s talk about Adjusted Plus/Minus.
But let’s talk about the scientific method first. According to the Wikipedia article, the scientific method is:
“The Oxford English Dictionary says that scientific method is: “a method of procedure that has characterized natural science since the 17th century, consisting in systematic observation, measurement, and experiment, and the formulation, testing, and modification of hypotheses……
In the 20th century, a hypothetico-deductive model for scientific method was formulated (for a more formal discussion, see below):
- 1. Use your experience: Consider the problem and try to make sense of it. Look for previous explanations. If this is a new problem to you, then move to step 2.
- 2. Form a conjecture: When nothing else is yet known, try to state an explanation, to someone else, or to your notebook.
- 3. Deduce a prediction from that explanation: If you assume 2 is true, what consequences follow?
- 4. Test: Look for the opposite of each consequence in order to disprove 2. It is a logical error to seek 3 directly as proof of 2. This error is called affirming the consequent.“
Nobel Laureate Richard Feynman sums it up here. For him, the key to science is this: you make a guess at a truth or law, you calculate the consequences of that guess, and then you test it against nature. If it disagrees with experiment, your guess is wrong. This is the crux of the matter.
Longtime readers know that this is not my first time taking a model apart. In fact it’s something of a habit. I love testing and deconstructing NBA performance models. Some examples are:
A lot of it has to do with my passion to explore and test what’s out there to see what I can adapt and learn to use in my own quest to build a better model. But lately, I had been short of time. As I said, the genesis and inspiration for this post come from Nate Silver positing the existence of a Melo effect and Andres putting up a game splits tool. Both made me feel like I had lost a step. I’d stopped advancing towards my desired goal of a better model.
I decided to start looking for play-by-play data sources (and this is a story for a future time, trust me). One of the things I focused on was the logistics behind Adjusted Plus/Minus.
Now I had my reservations, the main one being this: Prof. Berri has more than once kindly shared some of his work with me. One forthcoming article (Berri, David J. “Measuring Performance in the National Basketball Association.” In The Handbook of Sports Economics, eds. Stephen Shmanske and Leo Kahane; Oxford University Press) has the following neat little table:
This table shows all the most commonly used models of player performance in the NBA (See here for full on explanations) and it shows two key things:
- Explanatory Power to Wins (think experimental correlation to reality)
- Consistency of Player performance year to year (Does the model tell us something of use about the player over time)
You’ll note that the Scoring models are very consistent over time but don’t really measure success. Plus-Minus models have high explanatory power to wins but lower consistency (with APM being the worst). WS and WP do decent jobs at both with Wins Produced having a better consistency at the player level over time.
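For anyone who wants to check the consistency column at home, here is a minimal sketch of one way to measure year-to-year consistency: correlate each player's rating in one season with the same player's rating in the next. I am assuming a plain Pearson correlation here, the table may use a different measure, and the `pearson_r` helper and the ratings are made up for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up ratings for the same four players in back-to-back seasons.
season_1 = [2.0, -1.0, 0.5, 3.0]
season_2 = [1.5, -0.5, 1.0, 2.0]

consistency = pearson_r(season_1, season_2)
print(f"year-to-year r = {consistency:.2f}, r^2 = {consistency ** 2:.2f}")
```

A model whose ratings bounce around from season to season will show a low value here no matter how well it explains team wins in any single year.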
That lack of consistency for APM was a worrisome little tidbit that I filed away for future reference.
But as always, I set my doubts aside and decided to go off and build it myself.
My first step was to go to the most often quoted source for APM, BasketballValue.com. At this point, I want to make clear that the model I will be discussing is based on, and I quote BasketballValue.com: “The adjusted +/- calculations are in the spirit of the work of Dan Rosenbaum”. There are other regression-based +/- models out there that use different methodologies with varying degrees of success, and I make no claim as to their validity. I may, in fact, have some ideas around this myself (but again, Future Arturo will cover this at some point in the near future).
So I read the source material and set about building +/- and giving it my own little twist. I took all the splits longer than 2 minutes and set it up as follows:
Home Margin = b0 + a1H1 + a2H2 + … + b1X1 + b2X2 + … + bKXK + e, where
the H terms are the different homecourt scenarios (which I’ve talked about before at length) and the X terms are player minutes. I use +minutes played for home players, -minutes played for road players, and 0 for players who did not play.
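To make the sign convention concrete, here is a minimal sketch of how one row of the X part of that regression could be built for a single split. The `stint_row` helper, the player labels and the minutes are all hypothetical. The homecourt H terms are left out, and this is not the actual spreadsheet I used.

```python
def stint_row(players, home_minutes, road_minutes):
    """Build the player columns for one split.

    +minutes for players on the home side, -minutes for players on the
    road side, and 0 for everyone who did not play in the split.
    """
    row = []
    for player in players:
        if player in home_minutes:
            row.append(home_minutes[player])
        elif player in road_minutes:
            row.append(-road_minutes[player])
        else:
            row.append(0.0)
    return row

# Hypothetical league of five players and one 3.5-minute split.
players = ["A", "B", "C", "D", "E"]
row = stint_row(players, {"A": 3.5, "B": 3.5}, {"D": 3.5})
print(row)  # [3.5, 3.5, 0.0, -3.5, 0.0]
```

Stack one such row per split, put the home margin for that split on the left-hand side, and you have the data set for the regression.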
I regressed everything out for 2009 and I got the following:
Players and homecourt broken down, and that makes for a neat little post, right?
Except I like to double check my numbers, and two things jumped out. One, the correlation to wins was very low (~10% R^2). Two, the +/- numbers don’t quite add up at the team level. Somehow they do add up in the final +/- APM numbers.
This officially set off some warning bells. Something was funny.
So I decided to go off and do some more experimenting.
I decided to step through the work from here: http://www.82games.com/comm30.htm (the Dan Rosenbaum piece which is the basis claimed for APM). The article describes a 3-step process which breaks down as follows:
Step 1: Regress point margin per possession on the players on court vs. reference players. Take all players with >250 minutes played for 2002-3 and 2003-4. Weight by possessions, year, and game situation (there’s a whole algorithm for this which you can look up in the piece if you want, and which I had to spend a few hours building in Excel). Regress. He calls this True +/-.
I ran this regression multiple times. I ran it with the data set from the article. I went out and found the initial thread on APBR (http://sonicscentral.com/apbrmetrics/viewtopic.php?t=327). I also went and downloaded the data set generated here (by someone who is not me, just to make sure I wasn’t screwing it up): http://www.countthebasket.com/blog/2008/06/01/calculating-adjusted-plus-minus/
Every single regression gave me less than 5% R-Sq. So I feel confident in the statement that the R^2 of the model in step 1 (as described) is <5%.
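For reference, here is a minimal sketch of what an R-Sq number means for a weighted regression like this one: one minus the weighted squared error of the fitted values over the weighted squared spread of the observations around their weighted mean. The `weighted_r_squared` helper and the numbers are hypothetical, and the weights stand in for something like possessions rather than reproducing the weighting scheme in the article.

```python
def weighted_r_squared(observed, fitted, weights):
    """Weighted R^2: 1 - (weighted residual SS) / (weighted total SS)."""
    total_weight = sum(weights)
    mean_obs = sum(w * y for w, y in zip(weights, observed)) / total_weight
    ss_res = sum(w * (y - f) ** 2 for w, y, f in zip(weights, observed, fitted))
    ss_tot = sum(w * (y - mean_obs) ** 2 for w, y in zip(weights, observed))
    return 1.0 - ss_res / ss_tot

# Made-up margins per possession, fitted values, and possession counts.
observed = [0.10, -0.05, 0.20, -0.15]
fitted = [0.02, 0.01, 0.03, -0.01]
possessions = [12, 8, 15, 10]

print(f"R^2 = {weighted_r_squared(observed, fitted, possessions):.3f}")
```

A model that does no better than guessing the weighted average margin for every split scores 0 on this measure, which is the neighborhood a sub-5% R-Sq lives in.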
The question then becomes how we get to 95% overall Correlation to wins? Let’s talk about the other steps.
Step 2: The model now takes the True +/- values for each player from the first equation and regresses them against those players’ stats to determine weights for each stat. He reports an R^2 of 44%. It’s important to note that this is the R^2 between the True +/- for the 420 players and the stats, not between the stats and point margin per possession or wins.
The next bit is to take the weight of each stat and use it to calculate each player’s statistical +/- (think a version of Win Score or NBA Efficiency with a supremely slim correlation to wins and point margin).
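Once the stat weights exist, the statistical +/- is just a weighted sum of a player's stats. Here is a minimal sketch of that last bit. The `statistical_plus_minus` helper, the stat names and the weights are invented for illustration and are not the values from the article.

```python
def statistical_plus_minus(stat_weights, player_stats):
    """Weighted sum of a player's stats; stats the player lacks count as 0."""
    return sum(weight * player_stats.get(stat, 0.0)
               for stat, weight in stat_weights.items())

# Hypothetical weights, as if they came out of the step 2 regression.
stat_weights = {"points": 0.5, "rebounds": 0.25, "turnovers": -1.0}
player_stats = {"points": 20, "rebounds": 8, "turnovers": 2}

rating = statistical_plus_minus(stat_weights, player_stats)
print(rating)  # 10.0
```

Note that the weights inherit whatever noise sits in the True +/- values they were fitted to.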
Confused yet? Wait, there’s more. Now comes the really sketchy bit.
Step 3: Calculating Adjusted +/-
The final step is to take the pure regression and the stats model and add them up by player like so:
APM = x * Pure +/- + (1 - x) * Statistical +/-
He then proceeds to adjust x between 10% and 90% for each player to minimize the error. In essence, he tweaks the rating to get a high R-Square.
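The blend itself is a one-liner. Here is a minimal sketch, assuming x is simply held inside the 10% to 90% band. The `blended_rating` helper is hypothetical, and how x gets picked for each player is deliberately left out, since that is the part described in the article.

```python
def blended_rating(pure, statistical, x):
    """APM = x * Pure +/- + (1 - x) * Statistical +/-, with x kept in [0.1, 0.9]."""
    x = min(max(x, 0.1), 0.9)  # clamp x to the 10%-90% band
    return x * pure + (1.0 - x) * statistical

# Made-up ratings for one player.
print(blended_rating(pure=4.0, statistical=2.0, x=0.5))  # 3.0
```

Because x is set player by player, the blend has one extra knob per player on top of the two regressions.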
To summarize, the APM model calculates two variables with a low correlation to wins (R^2 <5%) and adds them up to minimize the error and guarantee a 90%+ Rsq. for the overall model.
What does this mean exactly? Well, the R^2 for the APM model is very much a fabrication. The correlation to point margin and wins of the model shown on BasketballValue is artificially inflated by adding the error back in. To put this in perspective, I would bet that a simple model that substitutes a team’s minutes played % (used to assign wins) for True +/-, and Wins Produced or Win Shares for Statistical +/-, would be much more consistent with team wins prior to the error correction (and produce more consistent results; I may in fact have another post on this for Future Arturo).
I went into this exercise hoping to find something that would make my life easier. If it worked, I could use it to derive and infer all sorts of cool stuff about opponents and defense. Sadly, the APM model examined does not hold up under scrutiny. It is built to account for all the variability in the process but holds very little actual correlation to the process itself.
In brief, I failed to simplify my life (with APM, although I looked at some other variations of this kind of model listed on the table that had some interesting potential) and I probably just exponentially increased the list of people who have less than kind thoughts towards me. At the end of the day, however, it’s all about science, and science is a cruel mistress.
Regardless, let me get you started:
Because we are all about the fanservice :-)
PS. I received a very gracious note from Aaron Barzilai, Ph.D from BasketballValue.com. It follows in full:
A couple people just pointed me to your post, so I wanted to clarify what I’m providing on basketballvalue.com. I think there might be a little confusion.
Certainly, as you quoted from my web page “The adjusted +/- calculations are in the spirit of the work of Dan Rosenbaum”. On basketballvalue.com, I report both unadjusted numbers (e.g. Overall Rating) and adjusted numbers (e.g. 1 Year Adj. +/-). These adjusted numbers are the result of the basic regression as outlined in Rosenbaum’s article that you link to, labeled formula (1). In that paper, he lists the results in Table 1 as “Pure Adjusted Plus/Minus Ratings”.
The rest of your post above gets into the technique that Dan introduces for Statistical Plus/Minus ratings and then Overall Plus/Minus ratings. While that’s interesting work, that’s not what is presented on basketballvalue.com. I tried to make that clear in the various explanations that have been posted, including the 82games articles referenced in the comments(“These ratings have been determined using only the matchup data available at basketballvalue.com. They do not explicitly include box score statistics”). Sorry if that still comes across as a little opaque.
Also, I just thought I should mention that the purpose of the site is really just to be a service to the community. The raw data is available for download so that others can use it in their work, which I find really rewarding. The unadjusted and adjusted +/- results are also updated on an almost daily basis so that people who are interested but might not be able to reproduce them on their own can see them. I don’t include commentary, and I think my position is summed up best in the conclusion of http://www.82games.com/barzilai2.htm :
“the results must be one piece of a broader assessment of the player.”
I very much appreciate the thoughtful response as well as the data services provided.
The numbers reported on your site can correctly be said to more closely resemble step 1 of the Rosenbaum paper (with an additional step to center the data). This is the initial step that I worked out to about a 5% R^2.
I do totally agree with the statement “the results must be one piece of a broader assessment of the player,” and I think I stated as much earlier.
You just know I’m coming back to this in the future :-)