I’m going to try science.
Words to live by. But trying science isn’t for the faint of heart or the thin-skinned. Science is cool and dangerous.
Science is Epic but epics traditionally end badly for the participants. Truly trying science means asking the tough question and being willing to put yourself out there. It means not only being willing to risk being wrong but being assured that you will be wrong often. Because Science is as much about disproving something as it is about proving something.
Sometimes it’s about doing cool stuff too :-).
Why the long intro? What follows is a guest post from one of my favorite bloggers (Alex Konkel, Sport Skeptic). I both love and hate his articles. I love them for the enjoyment I get as a reader and hate them because of the professional jealousy they inspire. The best and most concise introduction I can give Alex is this: he’s a scientist and I hope he keeps blowing s$%# up for a good long time.
For science of course.
Hey everyone – Arturo asked if I would do a guest post talking about some of the stuff I’ve been doing over at my site, Sport Skeptic. Some of it has to do with fooling around with numbers and seeing what happens; sometimes it’s with actual data and sometimes it’s with made up data that is sort of like real data. Here’s a quick tour through some of the posts along with ideas that have come from the comments. Even when Arturo isn’t writing, we’re all about fanservice.
The second piece I wrote that got any attention was one on how a player metric, or rating system, would only be as reliable as the statistics that went into it. I made up player data for two seasons; the variables correlated across seasons as much as some of the actual NBA stats do. For example, field goal percentage is somewhat noisy or hard to predict from one year to the next while rebounding is more consistent. Then I made up a few metrics that put different weights on these variables. What I find is that the metrics that give relatively high weight to consistent variables are themselves consistent; in my example, the metric that gave a lot of value to shooting was noisy from year to year while the metric that gave more value to rebounding was more consistent. The upside of a consistent metric is that it allows you to have a better idea of how your players will perform next year.
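Here’s a rough sketch of that kind of made-up-data experiment, using only the Python standard library. The season-to-season correlations (0.4 for shooting, 0.9 for rebounding) and the metric weights are assumptions picked for illustration, not the values from the original post, and the two stats are generated independently of each other to keep things simple.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def two_seasons(rng, n, r):
    """Make up a standardized stat for n players over two seasons,
    with a season-to-season correlation of roughly r."""
    year1 = [rng.gauss(0, 1) for _ in range(n)]
    year2 = [r * a + (1 - r * r) ** 0.5 * rng.gauss(0, 1) for a in year1]
    return year1, year2

def metric(shooting, rebounding, w_shoot, w_reb):
    """A toy player rating: a weighted sum of the two stats."""
    return [w_shoot * s + w_reb * r for s, r in zip(shooting, rebounding)]

rng = random.Random(0)   # fixed seed so the run is repeatable
n_players = 2000         # far more than a real league, to cut down on noise

shoot1, shoot2 = two_seasons(rng, n_players, 0.4)  # assumed: shooting is noisy
reb1, reb2 = two_seasons(rng, n_players, 0.9)      # assumed: rebounding is stable

# Year-to-year consistency of a shooting-heavy and a rebounding-heavy metric
shooting_heavy = pearson(metric(shoot1, reb1, 0.8, 0.2),
                         metric(shoot2, reb2, 0.8, 0.2))
rebound_heavy = pearson(metric(shoot1, reb1, 0.2, 0.8),
                        metric(shoot2, reb2, 0.2, 0.8))

print(f"shooting-heavy metric, year-to-year r: {shooting_heavy:.2f}")
print(f"rebound-heavy metric,  year-to-year r: {rebound_heavy:.2f}")
```

Under these assumptions the rebounding-heavy metric comes out noticeably more consistent, simply because it leans on the more consistent stat. That says nothing about which metric is closer to the truth.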
A few good points came out of the comments on this one. For example, consistency alone doesn’t have to be a good thing; you could credit players with wins due to their height and it would be very consistent. That would lead to the next point, which is that consistency doesn’t matter if your model isn’t any good. I would make two points here. First, you obviously have to have a decent model. If you gave credit to players according to their height, you probably wouldn’t do a very good job predicting future wins. Checking how well a metric predicts future wins might be a good way to check the quality of your model. Second, consistency is indeed nice, as long as everything else is equal. If your model is closer to the absolute truth and less consistent, it should be preferred over a less accurate but more consistent model. The questions are: how do you know which model is more accurate, and how consistent should we expect players to be? Those questions don’t have easy answers.
A little later I looked at what happens when you leave variables out of your model. This is an especially big problem when variables are correlated with each other, which is often true in sports. One of the strengths of regression is that if you have all the relevant information, you can figure out what weights all of your variables should get. But if you don’t, your weights can dance around like crazy; perhaps just as bad, the errors on those weights will definitely dance around and give you a mistaken impression of how important they are. This issue is a big reason why looking at simple correlations is typically a bad idea. A simple correlation is just a regression with one predictor. If you leave out all those other predictors, the correlation can be not just inaccurate but simply wrong. Deciding what variables to use in a model falls under the umbrella of model selection, and there are rarely ‘right’ answers. Should you use true shooting percentage, or effective field goal percentage and free throw percentage? If all you’re interested in is the effect of rebounding, should you include the kitchen sink of available variables? Two reasonable people could come to different conclusions, and a single person might use different models depending on what exactly their goal is. But in the complicated world of sports statistics, you probably wouldn’t be too wrong to always include as much information as possible.
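To see the left-out-variable problem in action, here’s a minimal sketch with made-up data. Two predictors, `x1` and `x2`, are correlated at about 0.7, and the outcome `y` truly depends on each with a weight of 1. All of those numbers and names are assumptions chosen for the example; nothing here is real NBA data.

```python
import random

def center(values):
    """Subtract the mean so the regressions don't need an intercept term."""
    m = sum(values) / len(values)
    return [v - m for v in values]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

rng = random.Random(1)
n = 5000

# Two correlated predictors (correlation about 0.7) and an outcome that
# truly depends on both with weight 1, plus noise.
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [0.7 * a + (1 - 0.7 ** 2) ** 0.5 * rng.gauss(0, 1) for a in x1]
y = [a + b + rng.gauss(0, 1) for a, b in zip(x1, x2)]

x1c, x2c, yc = center(x1), center(x2), center(y)

# Regression with x1 alone: slope = cov(x1, y) / var(x1)
slope_alone = dot(x1c, yc) / dot(x1c, x1c)

# Regression with both predictors: solve the 2x2 normal equations
#   s11*b1 + s12*b2 = s1y
#   s12*b1 + s22*b2 = s2y
s11, s22, s12 = dot(x1c, x1c), dot(x2c, x2c), dot(x1c, x2c)
s1y, s2y = dot(x1c, yc), dot(x2c, yc)
det = s11 * s22 - s12 * s12
b1 = (s1y * s22 - s2y * s12) / det
b2 = (s2y * s11 - s1y * s12) / det

print(f"x1 weight with x2 left out: {slope_alone:.2f}")
print(f"x1 weight with x2 included: {b1:.2f}")
print(f"x2 weight with x2 included: {b2:.2f}")
```

With `x2` left out, `x1` soaks up credit for the part of `x2` it travels with, so its weight lands near 1.7 instead of the true 1. Put `x2` back in and both weights settle near 1.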
I followed that up with a more thorough description of a model in the previous post. Here’s another comparison so you have something new. I used the player data I have, converted to per 48 minutes, and predicted WP48 from position (which essentially means we’re predicting adjusted P48), true shooting percentage, fouls, turnovers, blocks, steals, assists, defensive rebounds, and offensive rebounds, each scaled to normalized scores (not position, obviously, or WP48). I find that rebounds are the top predictors of WP48, followed by TS%, assists, fouls and turnovers (negative), steals, and blocks. It explains 97% of the variance (R squared is .97), meaning that these variables tell us virtually everything about why players vary in their WP48 values. I ran a similar model at the team level (excluding position since teams don’t have positions) predicting win percentage. TS% and turnovers are number one, followed by defensive rebounds, steals, blocks, and offensive rebounds; fouls and assists are actually not significant (I guess they don’t help teams win?). This model only explains about 67% of why teams win. What does this mean? I have no idea. Players and teams have different standard deviations for the same variable, so changing one standard deviation at the player level is not the same as changing one standard deviation at the team level. Also, the correlations between variables change; for example, teams with better TS% tend to have more assists, but TS% and assists are completely unrelated at the player level (whether or not you account for position). So you would probably expect the models to be different. This is relevant to a challenge I took from commenter Guy (who, I will repeat, won the challenge).
Finally, I had a couple run-ins with Phil Birnbaum. Most of this had to do with R squared and model interpretation. We decided that more is better; if possible, have more data. Even if you’re compiling across some variable, you should have as much of that variable as possible. For example, if you look at something across teams (such as win percentage, or salary, or something else), you’ll always have 30 points, one for each team. But you can make sure you have as many observations (such as games played) as possible per team before you run your analysis. This will make sure your estimates are as accurate as possible. But looking at the R squared is key. It tells you how much of the variable of interest your model explains (just the way I described earlier). Sometimes, especially if you have a lot of data, you can have a significant variable that just doesn’t tell you that much about what you’re interested in. Salary and team wins is one example; salary is indeed significantly correlated with wins, but the R squared is only about .25 (depending on what season you look at). That lets you know that most (75%) of why teams differ in wins is *not* explained by salary. And everyone agrees that salary only explains wins because it stands in a bit for talent; GMs have some ability to pay better players more money, but it isn’t anywhere near perfect. Some of this has to do with the rules (like the rookie salary scale) and some of it has to do with player evaluation (thank you, Joe Dumars).
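To put rough numbers on the salary example, here’s a small sketch. It assumes a one-predictor regression across 30 teams with an R squared of exactly .25 (the figure above is only “about .25”), and it uses the standard t statistic for testing a correlation.

```python
import math

def correlation_t_stat(r, n):
    """t statistic for testing whether a correlation r from n points is zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

n_teams = 30
r_squared = 0.25            # assumed exact for illustration
r = math.sqrt(r_squared)    # correlation between salary and wins: 0.5

t = correlation_t_stat(r, n_teams)
unexplained = 1 - r_squared

# With 28 degrees of freedom, |t| above roughly 2.05 is significant
# at the usual two-tailed 5% level.
print(f"t = {t:.2f} on {n_teams - 2} degrees of freedom")
print(f"share of variance in wins not explained by salary: {unexplained:.0%}")
```

So a correlation can clear the significance bar comfortably while still leaving three quarters of the differences between teams unexplained.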
To pick another fight that will get me in lots of trouble, you could also look at an old post by Eli Witus looking at usage and efficiency. His regression coefficient tells him that a high-usage line-up gains in offensive efficiency beyond expected (that is, when you put five guys on the court who use a lot of possessions, they seem to have a better offensive rating than you would expect from that line-up). And the converse is true, showing that if you put low usage players together, their efficiency drops. The conclusion is taken to be that increasing your usage lowers your efficiency. However, the R squared for the models never gets above .04; he never explains more than 4% of efficiency differences across line-ups. That means you could predict a line-up’s efficiency nearly as well by saying they’re all average. A better conclusion would be that if someone put a gun to your head and made you guess you would say that there is a connection between usage and efficiency, but the evidence is pretty weak.
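Here’s the arithmetic behind “nearly as well,” as a small sketch. It assumes an ordinary least squares fit with an intercept, measured on the same data it was fit to; in that case the typical size of the model’s misses, relative to the misses you’d get by guessing the average every time, is the square root of one minus R squared. Whether that matches the exact setup of the original usage models is an assumption on my part.

```python
import math

def error_ratio(r_squared):
    """Typical model error as a fraction of the error from always guessing
    the average (in-sample, least squares with an intercept)."""
    return math.sqrt(1 - r_squared)

r_squared = 0.04   # the upper bound quoted above for the usage models
ratio = error_ratio(r_squared)

print(f"model error relative to guessing the average: {ratio:.3f}")
print(f"reduction in typical error: {1 - ratio:.1%}")
```

An R squared of .04 shrinks your typical miss by only about 2% compared with calling every line-up average.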
I always try to end my posts with a summary. I guess for this one I would say 1) statistics, properly used, are your friend; 2) even if you don’t have real-life data you can learn a lot about sports stats by making your own; and 3) always take your vitamins.