August 19, 2020

Logistic Regression

Logistic Regression is a supervised machine learning algorithm that uses regression to predict the probability, a continuous value ranging from 0 to 1, that a data sample belongs to a specific category or class. The sample is then assigned to the more probable class, which ultimately makes Logistic Regression a classification algorithm.
The act of deciding which of two classes a data sample belongs to is called Binary Classification.
To predict the probability of a data sample belonging to a class:
1. Initialize all feature coefficients and intercept to 0.
2. Multiply each feature coefficient by its respective feature value, sum the products, and add the intercept to get the log-odds.
3. Place the log-odds into the sigmoid function to map the output to the range [0,1], giving us a probability.
  1. ODDS = P (event occurring) / P (event not occurring); the log-odds is the natural logarithm of the odds.
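To make the odds and log-odds definitions concrete, here is a minimal sketch in plain Python. The helper names odds and log_odds_from_probability are made up for illustration and do not come from any library.

```python
import math

def odds(p):
    """Odds of an event: P(event occurring) / P(event not occurring)."""
    return p / (1.0 - p)

def log_odds_from_probability(p):
    """Log-odds: the natural logarithm of the odds."""
    return math.log(odds(p))

# A probability of 0.5 means the event is as likely to happen as not,
# so the odds are 1 and the log-odds are 0.
even_odds = odds(0.5)
even_log_odds = log_odds_from_probability(0.5)

# Probabilities above 0.5 give positive log-odds; below 0.5, negative.
likely = log_odds_from_probability(0.8)
unlikely = log_odds_from_probability(0.2)
```

Note that the log-odds can be any real number, which is why a function such as the sigmoid is needed to bring them back into the range of a probability.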
Dot Product - Given feature matrix features, coefficient vector coefficients, and intercept, we can calculate the log-odds in NumPy as follows:
1. log_odds = np.dot(features, coefficients) + intercept
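As a small runnable sketch of the line above: the feature values, coefficients, and intercept below are invented purely for illustration, with one row per sample and one column per feature.

```python
import numpy as np

# Hypothetical data: 3 samples (rows), 2 features each (columns).
features = np.array([[1.0, 2.0],
                     [0.0, 1.0],
                     [3.0, 0.5]])

# Hypothetical model parameters: one coefficient per feature, plus an intercept.
coefficients = np.array([0.5, -1.0])
intercept = 0.25

# For each sample: sum(coefficient * feature value) + intercept.
log_odds = np.dot(features, coefficients) + intercept
```

With a 2-D features array and a 1-D coefficients array, np.dot returns one log-odds value per sample, so log_odds here has shape (3,).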
The Sigmoid Function is a mathematical function with a characteristic “S”-shaped curve, also called a sigmoid curve.
1. By plugging the log-odds into a Sigmoid Function, we map the log-odds, z, to the range [0,1]
  1. h(z) = 1 / (1 + e^(-z))
  2. e^(-z) is the exponential function, which can be written in NumPy as np.exp(-z)
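The sigmoid step, and the classification that follows from it, could be sketched as below. The log-odds values are made up, and the 0.5 threshold (with ties going to class 1) is an assumption that follows from picking the more probable of two classes.

```python
import numpy as np

def sigmoid(z):
    """Map log-odds z to a probability: h(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical log-odds for three samples.
log_odds = np.array([-1.25, 0.0, 1.25])
probabilities = sigmoid(log_odds)

# Assumed decision rule: class 1 if the probability is at least 0.5, else class 0.
threshold = 0.5
predictions = (probabilities >= threshold).astype(int)
```

Negative log-odds give probabilities below 0.5, zero gives exactly 0.5, and positive log-odds give probabilities above 0.5, so the three samples here are classified as 0, 1, and 1.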