March 17, 2020
STATISTICS
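The null hypothesis is the prediction that there is no real difference between the groups being compared, for example that their means are equal.
A p-value is the probability of seeing a result at least as extreme as ours if the null hypothesis were true; the smaller it is, the stronger the evidence against the null hypothesis.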
Generally, the more samples we collect, the smaller a difference we will be able to detect.
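A rough way to see the sample-size effect is through the standard error of the mean, which shrinks with the square root of the sample size. The sketch below assumes a known population standard deviation (the value 10.0 is made up for illustration), and the function name standard_error is just a label chosen here.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean for n samples with standard deviation sigma."""
    return sigma / math.sqrt(n)

# Quadrupling the sample size halves the standard error,
# so smaller differences between means become detectable.
for n in (25, 100, 400):
    print(n, standard_error(10.0, n))
```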
The more t-tests we perform, the more likely we are to get at least one false positive (a Type I error).
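To put a number on this: if each test has false-positive rate alpha and the tests are independent (an assumption made here for simplicity), the chance of at least one false positive across k tests is 1 - (1 - alpha)^k. A minimal sketch, with familywise_error_rate as an illustrative name:

```python
def familywise_error_rate(alpha, num_tests):
    """Probability of at least one Type I error across independent tests."""
    return 1 - (1 - alpha) ** num_tests

# With alpha = 0.05, the overall false-positive risk climbs quickly.
for k in (1, 3, 10):
    print(k, round(familywise_error_rate(0.05, k), 4))
```

Comparing three datasets pairwise already takes three t-tests, which is one reason to reach for a single test across all groups instead.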
ANOVA (Analysis of Variance) tests the null hypothesis that all of the datasets have the same mean. If we reject the null hypothesis with ANOVA, we’re saying that at least one of the datasets has a different mean; however, it does not tell us which datasets are different.
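As a sketch of what one-way ANOVA computes, the F statistic compares the variation between group means to the variation within groups. The code below uses only the standard library and stops at the F statistic; turning it into a p-value needs the F distribution, which a statistics library would normally supply. The function name and the sample numbers are made up for illustration.

```python
from statistics import mean

def anova_f_statistic(groups):
    """One-way ANOVA F statistic for a list of groups (each a list of numbers)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    group_means = [mean(g) for g in groups]
    grand_mean = mean([x for g in groups for x in g])

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of each value around its own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

# Hypothetical data: the third group sits well above the other two.
f_stat = anova_f_statistic([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f_stat)
```

A large F means the group means differ more than the within-group scatter would explain, but it still does not say which group is the odd one out.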
Numerical hypothesis tests such as the t-test and ANOVA assume the data is approximately normally distributed; do not rely on them for data that is clearly not normal.
The population standard deviations of the groups should be equal.
To check similarity between the standard deviations, it is normally sufficient to divide one standard deviation by the other and see if the ratio is “close enough” to 1.
“Close enough” may differ in different contexts but generally staying within 10% should suffice.
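The ratio check can be sketched as follows. It estimates each population standard deviation with the sample standard deviation (an assumption, since the true population values are rarely known), and the 10% tolerance is only the rule of thumb from these notes. The function names and data are illustrative.

```python
from statistics import stdev

def std_ratio(a, b):
    """Ratio of the sample standard deviations of two datasets."""
    return stdev(a) / stdev(b)

def roughly_equal_spread(a, b, tolerance=0.10):
    """True if the ratio of standard deviations is within tolerance of 1."""
    return abs(std_ratio(a, b) - 1) <= tolerance

# Hypothetical datasets: same spread, shifted mean.
print(roughly_equal_spread([1, 2, 3, 4, 5], [11, 12, 13, 14, 15]))
# Hypothetical datasets: the second has twice the spread of the first.
print(roughly_equal_spread([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))
```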
When comparing two or more datasets, the values in one distribution should not affect the values in another distribution.