Muller Unlimited

February 15, 2020

QUARTILES

  1. A common way to give a high-level overview of a dataset is to find the values that split the data into four groups of equal size. These values are called quartiles.

  2. The first step in finding the quartiles of a dataset is to sort the data from smallest to largest.

  3. Q2 is the median of the entire dataset.

  4. Q1 is the median of the first (lower) half of the sorted dataset.

  5. Q3 is the median of the second (upper) half of the sorted dataset.

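  6. There is no universally agreed-upon method of calculating quartiles, so two different tools may report different quartiles for the same data.

    1. For example, when the dataset has an odd number of values, one method excludes Q2 from both halves when calculating Q1 and Q3, while an alternative method includes it in both.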
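
The steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function name `quartiles` and the flag `include_median` are hypothetical, and the sketch assumes the dataset has at least two values.

```python
import statistics

def quartiles(data, include_median=False):
    """Return (Q1, Q2, Q3) using the median-of-halves approach.

    include_median is a hypothetical flag: when the dataset has an odd
    number of values, it decides whether Q2 is counted in both halves.
    Assumes len(data) >= 2.
    """
    values = sorted(data)              # step 1: sort smallest to largest
    n = len(values)
    half = n // 2
    q2 = statistics.median(values)     # Q2: median of the entire dataset

    if n % 2 == 1 and include_median:
        lower = values[:half + 1]      # middle value counted in both halves
        upper = values[half:]
    else:
        lower = values[:half]          # middle value (if any) left out
        upper = values[n - half:]

    return statistics.median(lower), q2, statistics.median(upper)

data = [9, 1, 8, 2, 7, 3, 6, 4, 5]
print(quartiles(data))                       # (2.5, 5, 7.5)
print(quartiles(data, include_median=True))  # (3, 5, 7)
```

The two calls give different values for Q1 and Q3 on the same nine numbers, which is the kind of disagreement between tools described above.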

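  7. We can calculate quartiles and other quantiles using NumPy's quantile function (with NumPy imported as np).

    1. third_quartile = np.quantile(dataset, 0.75)

    2. The second argument is the quantile to compute, expressed as a fraction. It is always between 0 and 1 (0.25 for Q1, 0.5 for Q2, 0.75 for Q3).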
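
As a short usage sketch of the call above, the example below computes all three quartiles for a small made-up dataset. The values in `dataset` are an assumption chosen for illustration. NumPy's default method interpolates linearly between neighbouring values, so its results can differ from a median-of-halves calculation.

```python
import numpy as np

# Made-up data for illustration only
dataset = [1, 2, 3, 4, 5, 6, 7, 8, 9]

first_quartile = np.quantile(dataset, 0.25)
second_quartile = np.quantile(dataset, 0.5)   # same as the median
third_quartile = np.quantile(dataset, 0.75)

# A list of fractions returns all requested quantiles in one call
q1, q2, q3 = np.quantile(dataset, [0.25, 0.5, 0.75])

print(first_quartile, second_quartile, third_quartile)  # 3.0 5.0 7.0
```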

  8. Quartiles are so commonly used that the three of them (Q1, Q2, Q3), together with the minimum and maximum of the dataset, are called the five-number summary.

    1. These five numbers help you quickly get a sense of the range, centrality, and spread of the dataset.
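
One possible way to collect the five-number summary with NumPy is sketched below. The helper name `five_number_summary` and the sample values are assumptions for illustration. The sketch relies on the fact that the 0 and 1 quantiles are the minimum and maximum of the data.

```python
import numpy as np

def five_number_summary(data):
    """Return min, Q1, Q2, Q3 and max (hypothetical helper).

    Uses NumPy's default quantile method, so Q1 and Q3 may differ
    from values produced by other tools.
    """
    minimum, q1, q2, q3, maximum = np.quantile(data, [0, 0.25, 0.5, 0.75, 1])
    return {"min": minimum, "Q1": q1, "Q2": q2, "Q3": q3, "max": maximum}

# Made-up data for illustration only
dataset = [15, 2, 9, 4, 12, 5, 10, 4, 7]
summary = five_number_summary(dataset)

for name, value in summary.items():
    print(name, value)

# The range is the distance between the two extremes
data_range = summary["max"] - summary["min"]
print("range", data_range)  # range 13.0
```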