November 18, 2019
Linear Regression is when you have a group of points on a graph, and you find a line that approximately resembles that group of points.
A good Linear Regression algorithm minimizes the error, or the distance from each point to the line. A line with the least error is the line that fits the data the best. It is called the line of best fit.
y = mx + b
m = slope
b = intercept, where the line crosses the y-axis.
def get_y(m, x, b):
return m * x + b
Pandas is a Python module for working with tabular data. Tabular data has a lot of the same functionality as SQL or Excel, but Pandas adds the power of Python.
A DataFrame is an object that stores data as rows and columns.