Regression Analysis

Regression analysis is a widely used predictive modelling technique that investigates the relationship between a dependent (target) variable and one or more independent (predictor) variables. It also allows us to compare the effects of variables measured on different scales, such as the effect of price changes and the number of promotional activities. These benefits help market researchers, data analysts, and data scientists evaluate candidate variables and select the best set for building predictive models.

What are the Applications of Regression Analysis?

  • Regression analysis estimates the relationship between two or more variables.
  • It indicates whether the relationship between the dependent variable and each independent variable is statistically significant.
  • It indicates the strength of the impact of multiple independent variables on a dependent variable.
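
As a rough illustration of comparing the strength of impact of predictors measured on different scales, the sketch below fits an ordinary least squares model on z-scored variables using NumPy. The data is made up, and the names `standardized_coefficients`, `price`, `promotions`, and `sales` are hypothetical, chosen only to echo the price and promotion example; this is one possible approach, not a prescribed method.

```python
import numpy as np

def standardized_coefficients(X, y):
    """Fit OLS on z-scored predictors and a z-scored target.

    Returns one coefficient per column of X. Because every variable is
    expressed in standard-deviation units, the magnitudes can be compared
    across predictors that were measured on different scales.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xz = (X - X.mean(axis=0)) / X.std(axis=0)
    yz = (y - y.mean()) / y.std()
    # No intercept column is needed: z-scored data has mean zero.
    coef, *_ = np.linalg.lstsq(Xz, yz, rcond=None)
    return coef

# Made-up data (an assumption for illustration only).
rng = np.random.default_rng(0)
price = rng.uniform(5.0, 15.0, size=200)                  # currency units
promotions = rng.integers(0, 6, size=200).astype(float)   # count per period
sales = 500.0 - 20.0 * price + 30.0 * promotions + rng.normal(0.0, 10.0, size=200)

coef = standardized_coefficients(np.column_stack([price, promotions]), sales)
print({name: round(float(c), 3) for name, c in zip(["price", "promotions"], coef)})
```

The sign of each coefficient gives the direction of the effect, and the absolute value gives a scale-free indication of its strength.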

What are the Most Popular Regression Models?

The following is a list of some of the most popular types of regression models:

  • Linear Regression
  • Logistic Regression
  • Polynomial / Multiple Linear Regression
  • Stepwise Regression
  • Ridge Regression
  • Lasso Regression
  • ElasticNet Regression
  • Multinomial Logistic Regression
  • Ordinal Logistic Regression

However, linear and logistic regression are the most widely used of these algorithms.

Which Model Applies When?

  • If the outcome is continuous – apply linear regression.
  • If the outcome is binary – apply logistic regression.
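
The rule of thumb above could be sketched as a small helper. The function name `suggest_regression` and its return labels are hypothetical, and the check is deliberately crude: it only counts distinct values, so a numeric outcome with a handful of categories or counts would still be labelled "linear" and needs a human decision.

```python
import numpy as np

def suggest_regression(y):
    """Suggest a model family from the outcome values (a rough heuristic).

    - exactly two distinct values      -> "logistic"
    - numeric with more than two       -> "linear"
    - anything else                    -> "undetermined"
    """
    y = np.asarray(y)
    n_unique = np.unique(y).size
    if n_unique == 2:
        return "logistic"
    if np.issubdtype(y.dtype, np.number) and n_unique > 2:
        return "linear"
    return "undetermined"

print(suggest_regression([0, 1, 1, 0]))        # binary outcome
print(suggest_regression([12.5, 48.0, 7.25]))  # continuous outcome
```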

Below are the key practices to follow when selecting the right regression model:

  • Data exploration is an essential part of building a predictive model. It should be your first step before selecting a model: identify the relationships between the variables and their impact on the target.
  • To compare the goodness of fit of different models, we can analyse metrics such as the statistical significance of parameters, R-squared, adjusted R-squared, AIC, BIC, and the error term. Another is Mallows’s Cp, which essentially checks for possible bias in your model by comparing it with all possible submodels (or a careful selection of them).
  • Cross-validation is one of the best ways to evaluate models used for prediction. In its simplest form, you divide your data set into two groups (train and validate), fit the model on the training group, and predict the validation group. The mean squared difference between the observed and predicted values gives you a measure of prediction accuracy.
  • If your data set has multiple confounding variables, you should not choose an automatic model selection method, because you do not want to put these in a model at the same time.
  • The choice also depends on your objective. A less powerful model can be easier to implement than a highly statistically significant one.
  • Regularization methods (Lasso, Ridge, and ElasticNet) work well in cases of high dimensionality and multicollinearity among the variables in the data set.
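
Several of the points above can be tried out in a few lines of NumPy. The sketch below computes R-squared and adjusted R-squared on the training data, then compares ordinary least squares with a ridge fit using the mean squared error on a held-out validation group. Everything here is an assumption for illustration: the data is synthetic with two nearly collinear predictors, the helper names are hypothetical, the 90/30 split is arbitrary, and the ridge penalty `alpha=5.0` is not tuned.

```python
import numpy as np

def add_intercept(X):
    """Prepend a column of ones so the model can learn an intercept."""
    return np.column_stack([np.ones(len(X)), X])

def fit_ridge(X, y, alpha=0.0):
    """Closed-form ridge fit; alpha=0 reduces to ordinary least squares.

    The intercept (first coefficient) is left unpenalised.
    """
    A = add_intercept(X)
    penalty = alpha * np.eye(A.shape[1])
    penalty[0, 0] = 0.0
    return np.linalg.solve(A.T @ A + penalty, A.T @ y)

def predict(X, beta):
    return add_intercept(X) @ beta

def mse(y, y_hat):
    """Mean squared difference between observed and predicted values."""
    return float(np.mean((y - y_hat) ** 2))

def r2_and_adjusted(y, y_hat, n_predictors):
    """R-squared and adjusted R-squared for a fitted model."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    n = len(y)
    adj = 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)
    return float(r2), float(adj)

# Synthetic data: 5 predictors, columns 0 and 1 nearly collinear.
rng = np.random.default_rng(42)
n = 120
X = rng.normal(size=(n, 5))
X[:, 1] = X[:, 0] + rng.normal(scale=0.05, size=n)
y = 3.0 + 2.0 * X[:, 0] - 1.0 * X[:, 2] + rng.normal(scale=1.0, size=n)

# Divide the data into two groups: train and validate.
idx = rng.permutation(n)
train, valid = idx[:90], idx[90:]

results = {}
for name, alpha in [("ols", 0.0), ("ridge", 5.0)]:
    beta = fit_ridge(X[train], y[train], alpha)
    r2, adj = r2_and_adjusted(y[train], predict(X[train], beta), X.shape[1])
    results[name] = {
        "train_r2": r2,
        "train_adj_r2": adj,
        "valid_mse": mse(y[valid], predict(X[valid], beta)),
    }

for name, metrics in results.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

OLS will always have the higher training R-squared, since it minimises the training error directly; the validation error is the fairer comparison for prediction. A single split like this is noisy, so repeating it over several folds gives a more stable estimate.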

What are the differences between Linear and Logistic regression models?

Under Construction !
