Linear Regression

Akash Patel
4 min read · Jun 10, 2021


Linear Regression is the most basic and widely used type of predictive analysis.

Contents

  1. Definition
  2. Goal of Linear Regression
  3. Types of Linear Regression
  4. Assumptions of Linear Regression
  5. Evaluation Metrics
  6. Points to Remember
  7. Applications

Definition

Linear Regression is one of the simplest supervised machine learning algorithms. It finds the relationship between one or more independent variables (predictors), denoted X, and a dependent variable (target), denoted y.

y (the left-hand side) is also known as the dependent variable, response variable, or outcome variable.

X (the right-hand side) is also known as the independent variable(s), explanatory variable(s), or predictor variable(s).

[Figure: Linear regression example]

In the above diagram, the blue dots show the distribution of y with respect to x. No single straight line runs through all the data points, so the main aim is to fit a regression line that minimizes the error between the actual and predicted values.

Finding the best fit line

By minimizing the distance (that is, the error) between the data points and the regression line, we can find the best-fit line for our dataset. There are different ways to measure this error, such as the sum of squared errors, the sum of absolute errors, or the root mean squared error.
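
As a concrete illustration, here is a minimal sketch that fits such a line by ordinary least squares using NumPy's polyfit. The toy data, noise level, and seed are made up for the example:

```python
import numpy as np

# Toy data: y is roughly linear in x, plus some noise.
rng = np.random.default_rng(42)
x = np.linspace(0, 10, 50)
y = 2.5 * x + 1.0 + rng.normal(scale=2.0, size=x.shape)

# polyfit with degree 1 finds the slope and intercept that
# minimize the sum of squared errors (ordinary least squares).
slope, intercept = np.polyfit(x, y, deg=1)

y_pred = slope * x + intercept
sse = np.sum((y - y_pred) ** 2)  # sum of squared errors of the fit

print(f"slope={slope:.3f}, intercept={intercept:.3f}, SSE={sse:.2f}")
```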

[Figure: Visualizing the error between the data points and the regression line]

Our main aim is to minimize the cost function by updating the values of θ. The θ that minimizes the cost function gives us the best-fit regression line for our dataset.

Cost function in linear regression: J(θ) = (1/2m) Σᵢ (ŷᵢ − yᵢ)², where ŷᵢ = θ₀ + θ₁xᵢ is the prediction for the i-th data point and m is the number of points.
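
One common way to minimize this cost function is gradient descent. Below is a bare-bones sketch for a single feature; the learning rate, iteration count, and synthetic data are arbitrary choices for illustration:

```python
import numpy as np

def gradient_descent(x, y, lr=0.05, n_iters=2000):
    """Minimize J(theta) = (1/(2m)) * sum((theta0 + theta1*x - y)**2)."""
    m = len(x)
    theta0, theta1 = 0.0, 0.0                 # arbitrary starting point
    for _ in range(n_iters):
        error = (theta0 + theta1 * x) - y     # prediction minus target
        theta0 -= lr * error.sum() / m        # step along dJ/dtheta0
        theta1 -= lr * (error * x).sum() / m  # step along dJ/dtheta1
    return theta0, theta1

rng = np.random.default_rng(0)
x = np.linspace(0, 5, 100)
y = 3.0 * x + 2.0 + rng.normal(scale=0.5, size=x.shape)

theta0, theta1 = gradient_descent(x, y)
print(f"intercept = {theta0:.2f}, slope = {theta1:.2f}")  # near 2.0 and 3.0
```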

Types of Linear Regression

Linear Regression is generally divided into two types:

  • Simple Linear Regression: there is exactly one explanatory variable X and a corresponding response variable y.
  • Multiple Linear Regression: there are two or more explanatory variables and a single response variable y. Both types are sketched below.
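
Here is a short sketch contrasting the two, using scikit-learn's LinearRegression (an assumption of convenience; any ordinary-least-squares implementation would do). The data is synthetic:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)

# Simple linear regression: one explanatory variable.
X_simple = rng.uniform(0, 10, size=(100, 1))
y_simple = 4.0 * X_simple[:, 0] + 3.0 + rng.normal(size=100)
simple = LinearRegression().fit(X_simple, y_simple)
print("simple:  coef =", simple.coef_, " intercept =", simple.intercept_)

# Multiple linear regression: several explanatory variables.
X_multi = rng.uniform(0, 10, size=(100, 3))
y_multi = X_multi @ np.array([1.5, -2.0, 0.5]) + 7.0 + rng.normal(size=100)
multi = LinearRegression().fit(X_multi, y_multi)
print("multiple: coef =", multi.coef_, " intercept =", multi.intercept_)
```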

Assumptions of Linear Regression

  1. Normality: for any fixed value of X, y is normally distributed.
  2. Linearity: the relationship between X and y is linear.
  3. Independence: observations are independent of each other.
  4. Homoscedasticity: the variance of the residuals is the same for any value of X (a rough check is sketched after this list).
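
These assumptions can be probed on the residuals of a fitted model. The sketch below assumes SciPy is available; the Shapiro-Wilk test and the two-halves variance comparison are just one simple way to check normality and homoscedasticity, not the only one:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = np.linspace(0, 10, 200)
y = 2.0 * x + 5.0 + rng.normal(scale=1.0, size=x.shape)

slope, intercept = np.polyfit(x, y, deg=1)
residuals = y - (slope * x + intercept)

# Normality: Shapiro-Wilk test on the residuals
# (a large p-value is consistent with normally distributed errors).
stat, pvalue = stats.shapiro(residuals)
print("Shapiro-Wilk p-value:", pvalue)

# Homoscedasticity (rough check): residual spread should be similar
# across the range of x; compare the variance in the two halves.
half = len(x) // 2
print("variance, 1st half:", residuals[:half].var())
print("variance, 2nd half:", residuals[half:].var())
```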

Evaluation Metrics in Linear Regression

Following are some common evaluation metrics for linear regression; the snippet after the list computes each of them.

  • Mean Squared Error (MSE): the average squared difference between the predicted and the actual values. It is convex and penalizes large errors heavily.
    MSE = (1/n) Σᵢ (yᵢ − ŷᵢ)²
  • Mean Absolute Error (MAE): the average absolute difference between the target value and the predicted value.
    MAE = (1/n) Σᵢ |yᵢ − ŷᵢ|
  • Root Mean Squared Error (RMSE): the square root of the average squared difference between the predicted and the actual values; unlike MSE, it is in the same units as y.
    RMSE = √( (1/n) Σᵢ (yᵢ − ŷᵢ)² )
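
The snippet below computes all three metrics directly from their formulas, on a small made-up set of predictions:

```python
import numpy as np

y_true = np.array([3.0, 5.0, 7.5, 10.0])  # actual values (hypothetical)
y_pred = np.array([2.5, 5.5, 7.0, 11.0])  # model predictions (hypothetical)

mse = np.mean((y_true - y_pred) ** 2)     # average squared error
mae = np.mean(np.abs(y_true - y_pred))    # average absolute error
rmse = np.sqrt(mse)                       # back in the units of y

print(f"MSE={mse:.3f}  MAE={mae:.3f}  RMSE={rmse:.3f}")
```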

Points to remember

  • It is used to solve regression problems.
  • The response variable is continuous in nature.
  • Linear regression is sensitive to outliers (see the sketch below).
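
To see the sensitivity to outliers, here is a toy demonstration: corrupting a single point noticeably shifts the least-squares fit.

```python
import numpy as np

x = np.arange(10, dtype=float)
y = 2.0 * x + 1.0                       # perfectly linear data

clean_fit = np.polyfit(x, y, deg=1)     # recovers slope 2, intercept 1

y_outlier = y.copy()
y_outlier[-1] += 50.0                   # one large outlier
outlier_fit = np.polyfit(x, y_outlier, deg=1)

print("clean  [slope, intercept]:", clean_fit)
print("outlier [slope, intercept]:", outlier_fit)  # both shift noticeably
```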

Applications of Linear Regression

Following are a few real-life applications of linear regression across different domains.

Business application: advertising spending and revenue.

Medical application: drug dosage and patients' blood pressure.

Agricultural application: the effect of fertilizer and water on crop yields.

