Codementor Events

Logistic Regression Analysis

Published Jul 18, 2020Last updated Jul 22, 2020
Logistic Regression Analysis

Introduction
Logistic regression is a statistical technique that predicts the probability that an observation belongs to one of the two categories of a binary (dichotomous) dependent variable based on one or more independent variables. These independent variables can be either continuous or categorical.

Let's see some of the use cases where we can apply logistic regression analysis to understand the application in a more clear manner.

Example 1: Let's say that we want to predict whether a student will pass or fail in exam based on the amount of time he/she spends on self studying, time spent on revision, state of mind during exam, and number of classes attended. The dependent variable in this use case is whether a student will pass or not (dichotomous variable) and independent variables are (self study time, revision time, % of classes attended).

Example 2: Let's say that a retailer wants to predict whether a user will respond to a marketing campaign or not based on the average amount spent by the user, frequency of purchase by the user, and response to past marketing campaigns.

However, before we perform logistic regression, we need to understand different assumptions that the data should comply to before performing the logistic regression. Let's look at these assumptions:

Assumptions
Assumption #1: The dependent variable should be a dichotomous scale. Examples of dichotomous variables are gender (male and female), exam result (pass and fail), presence of a disease (yes and no).

Assumption #2: The independent variables can be continous or categorical. Some of the examples of continuous variables are study time (measured in hours), weight (measured in kg), etc. Some of the examples of category variables are gender (male and female), height (short, and toll), and eye color (black, blue, and others), etc.

Assumption #3: The dependent variable should have mutually exclusive and exhaustive categories.

Assumption #4: The needs to be a linear relationship between the logit transformed dependent variable and the continuous independent variables.

In the next post, I will try to illustrate how to perform logistic regression using python, R, and SPSS Statistics.

Discover and read more posts from Ashish Soni
get started
post commentsBe the first to share your opinion
Show more replies