July 20, 2019
MACHINE LEARNING REVIEW
Machine Learning is a set of many different techniques that are each suited to answering different types of questions.
Supervised learning algorithms use labeled data as input, while unsupervised learning algorithms use unlabeled data.
We can further distinguish machine learning algorithms by the output they produce.
Regression
Classification
Regression is used to predict outputs that are continuous. The output is a numeric quantity that can take any value within a range, determined by the model's inputs, rather than being confined to a fixed set of possible labels.
Predict the height of a potted plant from the amount of rain.
Predict salary based on age and availability of internet.
Predict MPG based on size and year of car.
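As a rough illustration of regression, here is a minimal sketch of fitting a straight line by ordinary least squares to the potted-plant example. The rainfall and height numbers are invented for illustration, and the names fit_line and predict_height are hypothetical helpers, not part of any library.

```python
# Hypothetical data (made up for illustration): weekly rainfall in cm
# and the corresponding potted-plant height in cm.
rain = [1.0, 2.0, 3.0, 4.0, 5.0]
height = [12.0, 14.0, 16.0, 18.0, 20.0]


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y divided by the variance of x gives the slope.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(rain, height)


def predict_height(rain_cm):
    """Return a continuous prediction, not one of a fixed set of labels."""
    return slope * rain_cm + intercept


print(predict_height(6.0))  # 22.0 for this made-up data
```

The prediction can be any real number, which is what separates regression from classification.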
Classification is used to predict a discrete label. The outputs fall under a finite set of possible outcomes. Many situations have only two possible outcomes. This is called binary classification.
Predict whether an email is spam or not.
Predict if it will rain or not.
Predict if a user is a power user or casual user.
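To make binary classification concrete, here is a small sketch for the power-user example. It uses a deliberately simple rule, a threshold placed midway between the two class averages, rather than any standard model. The session counts and the name classify_user are assumptions made for illustration.

```python
# Hypothetical labeled training data: sessions per week for users
# whose type is already known. The numbers are invented.
casual_sessions = [1, 2, 3]
power_sessions = [9, 10, 11]


def mean(values):
    return sum(values) / len(values)


# Simple heuristic: put the decision threshold midway between the
# average of each class. This is a sketch, not a standard algorithm.
threshold = (mean(casual_sessions) + mean(power_sessions)) / 2


def classify_user(sessions_per_week):
    """Return one of exactly two labels: 'power' or 'casual'."""
    return "power" if sessions_per_week >= threshold else "casual"


print(classify_user(8))  # power
print(classify_user(4))  # casual
```

Whatever the input, the output is always one of two labels, never a quantity in between.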
Multi-class classification is used when there are more than two possible outcomes. It is useful for customer segmentation, image classification, and sentiment analysis of text. To perform these classifications, we use models such as Naive Bayes, K-Nearest Neighbors, and support vector machines (SVMs).
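When there are more than two possible outcomes, the same idea extends naturally. The sketch below applies a bare-bones K-Nearest Neighbors vote to a customer segmentation example with three segments. The feature values, the segment names, and the function knn_predict are hypothetical, and the code assumes Python 3.8 or later for math.dist.

```python
import math
from collections import Counter

# Hypothetical labeled customers: (visits per month, average basket size)
# paired with a segment label. All values are invented for illustration.
training = [
    ((1, 1), "budget"), ((1, 2), "budget"), ((2, 1), "budget"),
    ((5, 5), "regular"), ((5, 6), "regular"), ((6, 5), "regular"),
    ((9, 1), "premium"), ((9, 2), "premium"), ((10, 1), "premium"),
]


def knn_predict(query, k=3):
    """Label a point by majority vote among its k nearest training points."""
    # Sort training points by Euclidean distance to the query, keep k.
    neighbors = sorted(training, key=lambda item: math.dist(item[0], query))[:k]
    # Count the labels among the neighbors and return the most common one.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


print(knn_predict((1.5, 1.5)))  # budget
print(knn_predict((5.5, 5.5)))  # regular
print(knn_predict((9.5, 1.5)))  # premium
```

In practice a library implementation would usually be preferred; this version only shows how a single model can choose among several discrete labels.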