A good suggestion was made by Irishman to create a dedicated thread for the discussion of statistical methodology in football. The idea is to move as much of the discussion about statistical methodology as possible into one thread so that it doesn't clutter up other threads. Results of statistical analysis can still be presented in other threads, but discussion about statistical methodology is probably better off in its own thread.

I'll start off this thread by giving a simple intro to 4 commonly talked about methods:

1) z-scores
2) correlations
3) confidence intervals
4) hypothesis tests

z-scores

The purpose of z-scores is to take different sets of measurements and put them on the same scale. A common application of z-scores in football is to compare stats across eras. I'm not sure how best to explain z-scores (for me it would be best to just show the math.. one line and we're done lol), but one way to think of them is to ask how you would convert between two different scales like Celsius and Fahrenheit. First you need to know where the origin of the scale lies (where zero is): zero degrees Celsius corresponds to 32 degrees Fahrenheit. Then you need to know how the units of measurement compare: every 5 degree increase in Celsius corresponds to a 9 degree increase in Fahrenheit.

Same thing with z-scores. First you need to know what the origin corresponds to: a z-score of zero always corresponds to league average. Then you need to know the "unit of measurement", which in this case is the standard deviation (a measure of how far values spread from league average). For those interested in how standard deviation is calculated: https://en.wikipedia.org/wiki/Standard_deviation

Here's an example of how to convert a stat like passer rating to z-scores. In 1970 the league average for team passer ratings was 66, so you automatically know a 66 rating in 1970 equals a z-score of 0. The standard deviation was 14.34, so one z-score unit in 1970 equals 14.34 passer rating points.
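Since the math really is one line, here it is as code. The league mean (66) and standard deviation (14.34) are the 1970 figures above; the 80.34 rating fed into it is just a made-up input to show the conversion:

```python
def z_score(value, league_mean, league_sd):
    """Convert a raw stat to a z-score: distance from league average,
    measured in standard-deviation units."""
    return (value - league_mean) / league_sd

# 1970 team passer rating: league mean 66, standard deviation 14.34 (from above)
z = z_score(80.34, 66, 14.34)  # 80.34 is a hypothetical rating
print(round(z, 3))  # 1.0 -> exactly one standard deviation above league average
```

A rating equal to the league average always comes out to a z-score of 0, no matter the era.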
If a QB had a rating of 85 in 1970, that's (85 – 66)/14.34 = 1.325 z-scores above the mean. If the rating is below the mean you just get a negative z-score.

How to interpret z-scores? They tell you how "impressive" something was relative to league average, regardless of era or even type of measurement (you could actually compare z-scores of passing stats to z-scores of rushing stats). However, z-scores do not tell you whether an offense with a z-score of 1.325 in 1970 would perform with the same z-score in 2019. There's no implication about transplanting someone from the past to the present or vice versa, just a measure of how "impressive" something is regardless of era.

One final thing: you could use z-scores to report an "adjusted" rating in some target year, like a 1970 rating in 2019 numbers. Just calculate the z-score, then translate it to 2019 numbers, no different than going from Celsius to Fahrenheit.

Correlation

A correlation is a measure of how two sets of stats are related to each other. Without getting too technical, I'd interpret it as a measure of how well you can predict the value of one stat from the other. A correlation of zero means you have no ability to predict beyond random guessing. If, however, you have the maximum possible correlation of 1 or the minimum possible correlation of -1, then knowing one stat tells you exactly what happens to the other. The only difference between 1 and -1 is that a correlation of 1 means an increase in one stat implies an increase in the other, while a correlation of -1 means an increase in one stat implies a decrease in the other. Thus, correlations range from -1 to 1, and the closer the number is to either -1 or 1, the better you can predict one stat from the other.
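To make that -1 to 1 range concrete, here's a small Pearson correlation sketch in plain Python. The two toy stat lists are invented, not real data: one pair of stats moves together, the other moves in opposite directions:

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation: ranges from -1 (perfect inverse relationship)
    to 1 (perfect direct relationship)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# invented toy stats for five games
points_for    = [20, 24, 17, 31, 28]
yards         = [310, 350, 280, 420, 390]  # made to track points_for exactly
sacks_allowed = [5, 3, 6, 1, 2]            # made to move the other way

print(round(pearson(points_for, yards), 3))          # 1.0
print(round(pearson(points_for, sacks_allowed), 3))  # -0.994
```

Note that swapping the two lists gives the same number, which is the "no implied order" point: the correlation of yards with points is identical to the correlation of points with yards.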
Two things to remember with correlations: 1) there is no implication of causality, and 2) there is no implied order in the two stats being compared, meaning the predictive relationship is the same regardless of which stat you use to make the prediction.

One more thing about correlations: the degree to which you can predict one stat from the other is actually not the correlation itself but the square of the correlation, called r-squared or r^2. So if someone reports a correlation of 0.5, it's really the square of that number that is meaningful: 0.5^2 = 0.25, which tells you the proportion of variation in one stat you can explain by looking at the other (in this case, 25% of the variation in one stat is explained by the other).

Confidence intervals

With almost everything in statistics there is something called a confidence interval, or CI, associated with it, and usually it's a 95% CI. A 95% CI specifies the range within which the true value of that statistic lies with 95% probability. The important thing about a CI is that it depends on sample size; it's how you see the effect of sample size on a stat. To be clear, you almost never see 95% CIs reported in commonly available football stats. That's because the standards are low. They really should report a 95% CI with every stat so you can see how uncertain the estimates are.

To give you an idea of what a 95% CI looks like for Tom Brady (the only QB I've calculated it for): after 1 game played in a season, the 95% CI for Brady's passer rating spans almost 140 passer rating points (70 above and 70 below whatever rating he got in that first game!); after 2 games it narrows to about 40 above and 40 below his rating through 2 games; and the 95% CI keeps shrinking as more games are played. In other words, as sample size increases, the range within which the "true" passer rating lies keeps shrinking, and you can have more confidence that the stat reflects "true" ability.
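The Brady numbers above come from my own calculation, but the shrinking-with-sample-size pattern is generic. Here's a rough sketch of it under a simple normal approximation, where the 95% CI half-width for an average is about 1.96 standard deviations divided by the square root of the number of games; the per-game SD of 35 rating points is a made-up placeholder, not Brady's actual figure:

```python
from math import sqrt

def ci_half_width(per_game_sd, games, z=1.96):
    """Approximate 95% CI half-width for a per-game average after `games` games,
    assuming roughly normal per-game values (z = 1.96 for a 95% interval)."""
    return z * per_game_sd / sqrt(games)

# made-up per-game passer-rating SD of 35 points, just to show the shrinkage
for n in (1, 2, 4, 8, 16):
    print(n, "games -> CI half-width", round(ci_half_width(35, n), 1))
```

The key behavior is that the interval shrinks in proportion to the square root of the sample size: quadrupling the number of games cuts the uncertainty in half.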
CIs are how sample size affects every single statistic, and I put them here not only because they're important, but also because they make explaining the next topic easy.

Hypothesis testing

You often hear that something is "statistically significant". All that means is that something is too unlikely to have occurred by chance alone. Once you know the 95% CI (see previous section), all you have to do is ask whether the statistic you observed lies within the 95% CI or not. If yes, it is still too likely to have occurred by random variation alone. If no, then it's "statistically significant" and considered too unlikely to have occurred by chance alone.

The choice of a 95% CI rather than a 99% CI is arbitrary, but it's the standard in almost every area of science. A 95% CI corresponds to a 1 in 20 chance of the event occurring by random variation alone, and that's generally unlikely enough in most contexts to call it "statistically significant". I'll note, however, that there are other contexts where the threshold is way higher. The best example is particle physics, where the threshold might be 1 in 3.5 million (5 standard deviations) before it's statistically significant lol.

OK.. maybe that will get things started.
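P.S. For the code-minded, the significance check described in the hypothesis-testing section really is just a couple of lines. The baseline mean, CI half-width, and observed ratings here are all invented numbers, purely for illustration:

```python
def is_significant(observed, baseline_mean, ci_half_width):
    """Outside the 95% CI around the baseline -> too unlikely
    to have occurred by random variation alone."""
    return abs(observed - baseline_mean) > ci_half_width

# invented numbers: baseline rating 90, 95% CI half-width of 12 points
print(is_significant(105, 90, 12))  # True  -> "statistically significant"
print(is_significant(95, 90, 12))   # False -> within normal random variation
```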