February 7, 2020

Statistics Using Python

  1. Find the variance of a dataset with NumPy:

        import numpy as np

        dataset = [3, 5, -2, 49, 10]
        variance = np.var(dataset)

  2. Variance is useful because it is a measure of spread.

  3. While a histogram gives a general sense of the spread, computing the variance gives us a numerical value, which lets us compare the spread of datasets with more confidence than judging by eye.

  4. Standard deviation is computed by taking the square root of the variance.

  5. In Python, we can take the square root of a number by raising it to the power 0.5 with the ** operator, for example variance ** 0.5.

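  6. The NumPy function np.std() takes a dataset as a parameter and returns the standard deviation of that dataset. By default it returns the population standard deviation (ddof=0), which matches the default of np.var().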

  7. If your data is roughly normally distributed, you can expect around 68% of it to fall within one standard deviation of the mean, 95% within 2 standard deviations, and 99.7% within 3 standard deviations. Skewed data or data with outliers can deviate noticeably from these percentages.
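
The notes above can be tied together in one short sketch. It computes the variance by hand and with np.var(), takes the square root with ** 0.5, compares that to np.std(), and then checks how much of the data falls within a given number of standard deviations of the mean. The helper share_within and the randomly generated sample are assumptions added for illustration; they are not part of the original notes.

```python
import numpy as np

dataset = [3, 5, -2, 49, 10]  # the example dataset from the notes above

# Variance by hand: the mean of the squared differences from the mean.
mean = sum(dataset) / len(dataset)
manual_variance = sum((x - mean) ** 2 for x in dataset) / len(dataset)

# np.var computes the same population variance (ddof=0 by default),
# approximately 338.8 for this dataset.
variance = np.var(dataset)

# Standard deviation two ways: square root of the variance, or np.std.
std_from_variance = variance ** 0.5
std = np.std(dataset)


def share_within(data, k):
    """Hypothetical helper: fraction of values within k standard deviations of the mean."""
    values = np.asarray(data, dtype=float)
    distance = np.abs(values - values.mean())
    return float(np.mean(distance <= k * values.std()))


# Illustrative sample drawn from a normal distribution (an assumption,
# used only to check the 68 / 95 / 99.7 percentages).
rng = np.random.default_rng(0)
normal_sample = rng.normal(loc=0.0, scale=1.0, size=100_000)

for k in (1, 2, 3):
    print(k, share_within(normal_sample, k), share_within(dataset, k))
```

For the normal sample the printed shares should land close to 0.68, 0.95, and 0.997. The five-value dataset does not follow that pattern: 49 is the only value more than one standard deviation from the mean, so 80% of the data falls within one standard deviation, which is why the percentages are best treated as a rule for roughly normal data.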
