August 11, 2020

More K-Nearest Neighbors Classifier

  1. Classify the new point based on those neighbors

    1. What happens when you encounter a tie?

      1. There are different strategies.

      2. One way to break a tie is to choose the class of the closest point.

    2. To measure performance, take each entry in the validation set and classify it against the training set. Counting how many times our classifier got the label right or wrong gives us the validation accuracy.

      1. The validation accuracy will change depending on which k we use (a sketch of both the tie-break and this calculation follows this item).
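
A minimal from-scratch sketch of both ideas above (breaking ties by falling back to the closest point, and computing validation accuracy); all names here are invented for illustration:

    from collections import Counter

    def distance(p, q):
        # Euclidean distance between two equal-length points
        return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

    def knn_classify(point, training_points, training_labels, k):
        # Indices of the k training points closest to the new point
        neighbors = sorted(range(len(training_points)),
                           key=lambda i: distance(point, training_points[i]))[:k]
        votes = Counter(training_labels[i] for i in neighbors).most_common()
        # Tie-break: if the top two classes got the same number of votes,
        # fall back to the class of the single closest point
        if len(votes) > 1 and votes[0][1] == votes[1][1]:
            return training_labels[neighbors[0]]
        return votes[0][0]

    def validation_accuracy(validation_points, validation_labels,
                            training_points, training_labels, k):
        # Fraction of validation points classified correctly
        correct = sum(knn_classify(p, training_points, training_labels, k) == label
                      for p, label in zip(validation_points, validation_labels))
        return correct / len(validation_points)

Calling validation_accuracy with several values of k shows the point above directly: the accuracy changes as k changes.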

  2. Overfitting occurs when you rely too heavily on your training data. You assume that data in the real world will always behave exactly like your training data.

    1. In K-Nearest Neighbors, overfitting happens when you don’t consider enough neighbors, i.e. when k is too small.

  3. Underfitting occurs when your classifier doesn’t pay enough attention to the small quirks in the training set; in K-Nearest Neighbors, this happens when k is too large (the k sweep in the sketch after item 6 shows both failure modes).

  4. Rather than writing your own classifier every time, you can use:

    1. from sklearn.neighbors import KNeighborsClassifier

    2. classifier = KNeighborsClassifier(n_neighbors=3)

    3. In the code above, k = 3

  5. Next, we train the classifier:

    1. .fit() takes two parameters: a list of training points and a list of their labels.

  6. Then, we classify new points:

    1. .predict() takes a list of points we want to classify and returns the predicted labels (see the end-to-end sketch below).
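
A minimal end-to-end sketch of items 4 through 6, plus a k sweep that makes the overfitting/underfitting trade-off from items 2 and 3 visible; the points and labels are toy data invented for illustration:

    from sklearn.neighbors import KNeighborsClassifier

    # Toy data: [weight, height] points with two classes (invented)
    training_points = [[60, 160], [72, 180], [55, 150], [80, 185], [68, 175]]
    training_labels = [0, 1, 0, 1, 1]
    validation_points = [[58, 155], [78, 183]]
    validation_labels = [0, 1]

    classifier = KNeighborsClassifier(n_neighbors=3)   # k = 3
    classifier.fit(training_points, training_labels)   # points, labels
    print(classifier.predict([[63, 158], [75, 182]]))  # labels for new points

    # Sweeping k: very small k overfits (memorizes training quirks),
    # very large k underfits (ignores local structure).
    for k in (1, 3, 5):
        clf = KNeighborsClassifier(n_neighbors=k)
        clf.fit(training_points, training_labels)
        print(k, clf.score(validation_points, validation_labels))

score() here is scikit-learn’s built-in mean accuracy, i.e. the same validation accuracy we computed by hand in the first sketch.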
