Linear Regression
Linear regression is one of the most basic and widely used types of predictive analysis.
Content
- Definition
- Finding the best fit line
- Types of Linear Regression
- Assumptions of Linear Regression
- Evaluation Metrics
- Points to Remember
- Applications
- References
Definition
Linear regression is one of the simplest supervised machine learning algorithms. It finds the relationship between one or more independent variables (predictors), denoted X, and a dependent variable (target), denoted y.
y (the left-hand side) is also known as the dependent variable, response variable, or outcome variable.
X (the right-hand side) is also known as the independent variable, explanatory variable, or predictor variable.
In a scatter plot of y against X, there is rarely a single straight line that runs through all the data points. So the main aim is to fit a regression line that minimizes the error between the actual and predicted values.
Finding the best fit line
We find the best-fit line by minimizing the distance (i.e., the error) between the data points and the regression line. This distance can be measured in different ways, such as the sum of squared errors, the sum of absolute errors, or the root mean squared error.
Our main aim is to minimize the cost function by updating the values of the parameters θ. The values of θ that minimize the cost function give us the best-fit regression line for our dataset.
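As a concrete illustration, simple linear regression with a squared-error cost has a closed-form solution, so θ₀ and θ₁ can be computed directly. A minimal sketch with made-up toy data (the numbers are purely illustrative):

```python
import numpy as np

# Toy data: roughly y = 2x, with a little noise (illustrative values).
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.0, 6.2, 7.9, 10.1])

# Closed-form least-squares estimates for y = theta0 + theta1 * x:
#   theta1 = cov(X, y) / var(X)
#   theta0 = mean(y) - theta1 * mean(X)
theta1 = np.sum((X - X.mean()) * (y - y.mean())) / np.sum((X - X.mean()) ** 2)
theta0 = y.mean() - theta1 * X.mean()

y_pred = theta0 + theta1 * X  # predictions from the fitted line
```

The same θ values could also be reached iteratively (e.g., by gradient descent on the cost function); the closed form is just the shortest path for one predictor.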
Types of Linear Regression
Linear regression is generally divided into two types:
- Simple Linear Regression :- In simple linear regression we have only one explanatory variable X and a corresponding response variable y.
- Multiple Linear Regression :- In multiple linear regression we have two or more explanatory variables and a corresponding response variable y.
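To make the distinction concrete, the sketch below fits a multiple linear regression with two explanatory variables using ordinary least squares via `np.linalg.lstsq`. The data is synthetic and the true coefficients (1, 2, −3) are chosen purely for illustration:

```python
import numpy as np

# Synthetic data: y = 1 + 2*x1 - 3*x2 + small noise (illustrative).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))  # two explanatory variables
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1] + rng.normal(scale=0.1, size=50)

# Prepend a column of ones so the intercept is estimated too.
A = np.column_stack([np.ones(len(X)), X])
theta, *_ = np.linalg.lstsq(A, y, rcond=None)
# theta ≈ [intercept, coefficient of x1, coefficient of x2]
```

Simple linear regression is just the special case where `X` has a single column.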
Assumptions of Linear Regression
- Normality :- For any fixed value of X, y is normally distributed.
- Linearity :- The relationship between X and y is linear.
- Independence :- Observations are independent of each other.
- Homoscedasticity :- The variance of the residuals is the same for any value of X.
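Some of these assumptions can be checked roughly from the residuals of a fit: with an intercept, least-squares residuals average to zero, and under homoscedasticity their spread should not change with X. A minimal sketch with synthetic data and illustrative thresholds:

```python
import numpy as np

# Synthetic data satisfying the assumptions: linear trend, constant noise.
rng = np.random.default_rng(1)
X = np.linspace(0.0, 10.0, 200)
y = 3.0 + 0.5 * X + rng.normal(scale=1.0, size=X.size)

# Fit by closed-form least squares, then inspect the residuals.
theta1 = np.sum((X - X.mean()) * (y - y.mean())) / np.sum((X - X.mean()) ** 2)
theta0 = y.mean() - theta1 * X.mean()
residuals = y - (theta0 + theta1 * X)

mean_resid = abs(residuals.mean())  # should be ~0 for an OLS fit

# Compare residual spread on the lower vs. upper half of X;
# a ratio far from 1 would suggest heteroscedasticity.
lo, hi = residuals[: X.size // 2], residuals[X.size // 2 :]
spread_ratio = hi.std() / lo.std()
```

In practice a residual-vs-X plot is the usual diagnostic; the numeric check above is just a quick stand-in.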
Evaluation Metrics in Linear Regression
Following are some evaluation metrics for linear regression:
- Mean Squared Error (MSE) :- MSE gives us the average squared difference between the predicted and actual values. It is convex and penalizes large errors heavily.
- Mean Absolute Error (MAE) :- MAE gives us the average absolute difference between the target and predicted values.
- Root Mean Squared Error (RMSE) :- RMSE is the square root of the average squared difference between the predicted and actual values.
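All three metrics are straightforward to compute by hand. A minimal sketch with illustrative values (not from any real dataset):

```python
import numpy as np

# Illustrative actual vs. predicted values.
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.5, 5.5, 7.0, 8.0])

mse = np.mean((y_true - y_pred) ** 2)   # average squared difference
mae = np.mean(np.abs(y_true - y_pred))  # average absolute difference
rmse = np.sqrt(mse)                     # square root of the MSE
```

Note how the single largest error (1.0 on the last point) contributes 1.0 to the MSE sum but only 1.0 to the MAE sum before averaging, so MSE weights it four times as heavily as a 0.5 error, while MAE weights it only twice as heavily.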
Points to Remember
- Linear regression is used to solve regression problems.
- The response variable is continuous in nature.
- Linear regression is sensitive to outliers.
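The outlier sensitivity is easy to demonstrate: because squared error grows quadratically, a single extreme point can pull the fitted slope far from the trend of the rest of the data. A small sketch with synthetic data:

```python
import numpy as np

def fit_slope(X, y):
    # Closed-form least-squares slope for y = theta0 + theta1 * x.
    return np.sum((X - X.mean()) * (y - y.mean())) / np.sum((X - X.mean()) ** 2)

X = np.arange(1.0, 11.0)  # 1, 2, ..., 10
y = 2.0 * X               # points lying exactly on a line of slope 2

slope_clean = fit_slope(X, y)

# Add one extreme outlier and refit: the slope shifts substantially.
X_out = np.append(X, 10.0)
y_out = np.append(y, 100.0)
slope_outlier = fit_slope(X_out, y_out)
```

This is why outliers are usually inspected (and sometimes removed or down-weighted, e.g. with robust regression) before trusting a least-squares fit.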
Applications of Linear Regression
Following are a few real-life applications of linear regression in different domains.
- Business :- advertising spending and revenue
- Medicine :- drug dosage and patients' blood pressure
- Agriculture :- effect of fertilizer and water on crop yields
References
- Wikipedia: Linear regression
- Towards Data Science blog