Before going into depth about linear regression, let us first get familiar with regression itself. Regression is a technique for predicting a target value from one or more predictor variables. It is mostly used to quantify the relationship between variables, often with the aim of understanding cause and effect. Regression techniques differ mainly in the number of independent variables they use and in the form of the relationship between the independent and dependent variables.
Linear Regression is the most basic form of regression analysis. The standard assumptions of the linear regression model are:
1. Linearity - the relationship between the independent and dependent variables is linear.
2. Independence - the residuals (errors) are independent of one another.
3. Homoscedasticity - the residuals have constant variance across all values of the independent variables.
4. Normality - the residuals are approximately normally distributed.
Objective: To find the best fit line for the relationship between the predictors and the predictive/dependent variable.
The equation of the best fit line is:
y = B0 + B1*x
where B0 is the intercept and B1 is the slope of the line.
Using the best fit line, we can analyze how changes in the independent variable affect the dependent variable.
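As a minimal sketch of fitting the best fit line above, the following uses NumPy's least-squares polynomial fit on a small made-up data set (the x and y values here are hypothetical, chosen only for illustration):

```python
import numpy as np

# Hypothetical data: x is the independent variable, y the dependent variable
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 6.2, 8.1, 9.9])

# Degree-1 polyfit performs a least-squares fit and returns [B1, B0]
b1, b0 = np.polyfit(x, y, 1)

# Predictions from the best fit line: y_hat = B0 + B1*x
y_hat = b0 + b1 * x
print("intercept B0 =", b0, "slope B1 =", b1)
```

The fitted slope and intercept can then be read directly as the estimated effect of a one-unit change in x on y.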
Common metrics for evaluating a regression model are:
1. Mean Absolute Error (MAE) - the average of the absolute differences between the actual and predicted values.
2. Mean Absolute Percentage Error (MAPE) - the average of the absolute differences between the actual and predicted values divided by the actual values, expressed as a percentage.
3. Root Mean Square Error (RMSE) - the square root of the average of the squared differences between the actual and predicted values.
4. R-squared - the proportion of the variance in the dependent variable that is explained by the independent variables in the model.
5. Adjusted R-squared - addresses a drawback of R-squared, which never decreases when more variables are added. Adjusted R-squared applies a penalty for each added variable, so it increases only if the new variable makes a meaningful contribution to the model.
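The five metrics above can be sketched in a few lines of NumPy. The actual and predicted values below are made up for illustration, and k is the assumed number of predictors in the model:

```python
import numpy as np

# Hypothetical actual and predicted values
y_true = np.array([3.0, 5.0, 7.5, 10.0])
y_pred = np.array([2.5, 5.5, 7.0, 9.0])
n, k = len(y_true), 1  # n observations, k predictors (assumed k=1 here)

# MAE: average absolute difference between actual and predicted values
mae = np.mean(np.abs(y_true - y_pred))

# MAPE: average absolute difference relative to the actual values, in percent
mape = np.mean(np.abs((y_true - y_pred) / y_true)) * 100

# RMSE: square root of the average squared difference
rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))

# R-squared: 1 minus (residual sum of squares / total sum of squares)
ss_res = np.sum((y_true - y_pred) ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

# Adjusted R-squared: penalizes each additional predictor
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
```

Note that adjusted R-squared is always at most R-squared, and the gap widens as more predictors are added without improving the fit.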
Simple Linear Regression: models the relationship between the dependent variable and a single independent variable.
Multiple Linear Regression: models the relationship between the dependent variable and two or more independent variables.
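A minimal sketch of multiple linear regression with two predictors, using NumPy's least-squares solver on hypothetical data (the design matrix and targets below are invented for illustration):

```python
import numpy as np

# Hypothetical data: two predictor columns x1 and x2
X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0],
              [5.0, 6.0]])
# Targets generated from y = 1 + 2*x1 + 2.5*x2 (so the fit recovers them)
y = np.array([8.0, 7.5, 17.0, 16.5, 26.0])

# Prepend a column of ones so the intercept B0 is estimated too
A = np.column_stack([np.ones(len(X)), X])

# Least-squares solution gives [B0, B1, B2]
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
b0, b1, b2 = coef
print("B0 =", b0, "B1 =", b1, "B2 =", b2)
```

The same pattern extends to any number of predictors by adding columns to X.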
In upcoming articles, we will also walk through hands-on implementations of linear regression models on real-world data sets using Python.