Churn Rate Prediction for a bank
The aim of this notebook is to predict customer churn for a bank, i.e. which customers are going to leave the bank's service.
A neural network is used as the modelling method. The dataset used in this notebook was introduced by Pushkar Mandot in his blog post.
The dataset can be downloaded here.
The dataset contains 10000 rows and 14 columns. I do not explain the data in detail, as the dataset is largely self-explanatory.
Importing data
# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
# Importing the dataset
# A forward slash works on every platform and avoids the invalid "\C" escape sequence
dataset = pd.read_csv('data/Churn_Modelling.csv')
print(dataset.shape)
dataset.head()
Creating training and test set
Create the matrix of features and the vector of the target variable. The slice 3:13 used below keeps columns 4 to 13 and excludes the first three columns (‘row_number’, ‘customerid’ and ‘surname’), which are not useful in our analysis. Column 14, ‘Exited’, is our target variable.
X = dataset.iloc[:, 3:13].values
y = dataset.iloc[:, 13].values
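To make the slicing concrete, here is a small sketch using a plain Python list of column names. The names are assumed from the commonly distributed Churn_Modelling.csv and should be checked against dataset.columns. The only point is that 3:13 selects positions 3 through 12, because the end index is exclusive.

```python
# Column names assumed from the commonly distributed Churn_Modelling.csv;
# verify them against dataset.columns before relying on them.
columns = [
    "RowNumber", "CustomerId", "Surname", "CreditScore", "Geography",
    "Gender", "Age", "Tenure", "Balance", "NumOfProducts",
    "HasCrCard", "IsActiveMember", "EstimatedSalary", "Exited",
]

# Python slices are zero-based and exclude the end index,
# so 3:13 keeps positions 3..12 and drops positions 0, 1 and 2.
feature_columns = columns[3:13]
target_column = columns[13]
dropped_columns = columns[:3]

print(len(feature_columns), feature_columns)
print(target_column)
print(dropped_columns)
```

With these assumed names, the feature matrix has ten columns before any encoding is applied.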
Let us take a look at the predictors in X. As can be seen below, the dataset is fairly clean. Only the two string columns need to be converted to categorical codes or one-hot values before they can be fed into a classifier.
print(X)
The target y contains only 0s and 1s: 0 stands for customers who are still with the bank, and 1 for customers who have left.
print(y)
Encoding categorical variables: we use the LabelEncoder and OneHotEncoder from sklearn to transform the string variables in X. First, use LabelEncoder to encode the labels in a column as integers from 0 to n_classes-1. Then, use OneHotEncoder to transform those integers into one-hot vectors.
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder_X_1 = LabelEncoder()
X[:, 1] = labelencoder_X_1.fit_transform(X[:, 1])
labelencoder_X_2 = LabelEncoder()
X[:, 2] = labelencoder_X_2.fit_transform(X[:, 2])
X
Now you can see that the country names are replaced by 0, 1 and 2, while the two genders are replaced by 0 and 1.
Label encoding has introduced a new problem in our data. LabelEncoder has replaced France with 0, Germany with 1 and Spain with 2, but these countries have no natural order: Germany is not greater than France, and France is not smaller than Spain. A model would read the integers as ordered, so we need to create dummy variables for Country. We do not need to do the same for the Gender variable, as it is binary.
Here, we use the OneHotEncoder to do the job.
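from sklearn.compose import ColumnTransformer
# The categorical_features argument was removed in scikit-learn 0.22;
# ColumnTransformer applies the encoder to column 1 and passes the rest through.
ct = ColumnTransformer([('country', OneHotEncoder(), [1])], remainder = 'passthrough', sparse_threshold = 0)
X = ct.fit_transform(X).astype(float)
# Drop the first dummy column; it is implied by the other two
X = X[:, 1:]
X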
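The two encoding steps can be traced by hand on a toy column. The sketch below is a dependency-free illustration of the idea, not the scikit-learn implementation, and the country values are made up for the example.

```python
# Toy stand-in for the country column; the values are illustrative only.
countries = ["France", "Spain", "Germany", "France", "Germany"]

# Step 1: label encoding maps each distinct value to an integer 0..n_classes-1.
# Classes are taken in sorted order, which is also how LabelEncoder orders them.
classes = sorted(set(countries))
label_of = {name: idx for idx, name in enumerate(classes)}
labels = [label_of[c] for c in countries]

# Step 2: one-hot encoding turns each integer into a 0/1 indicator vector.
one_hot = [[1.0 if label == k else 0.0 for k in range(len(classes))]
           for label in labels]

# Step 3: drop the first indicator column; it is implied by the other two.
reduced = [row[1:] for row in one_hot]

print(classes)
print(labels)
print(reduced)
```

A row of two zeros in the reduced output stands for the dropped first class, so no information is lost by removing that column.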
Assigning training set and test set
# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2)
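The split above is random, so each run produces different training and test rows. As a rough sketch of what a seeded 80/20 split does, here is a plain Python version. The helper split_indices is hypothetical and is not how scikit-learn implements it; with train_test_split itself, the random_state argument serves the same purpose.

```python
import random

def split_indices(n_samples, test_size=0.2, seed=0):
    """Hypothetical helper: shuffle row indices and cut them into two parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded, so the split is repeatable
    n_test = int(round(n_samples * test_size))
    return indices[n_test:], indices[:n_test]

train_idx, test_idx = split_indices(10000, test_size=0.2, seed=0)
print(len(train_idx), len(test_idx))
```

Fixing the seed makes later accuracy figures comparable between runs of the notebook.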
Preprocessing
We fit the StandardScaler on the training set and transform the training set with it. For the model to work on the test data, the test set must be scaled in exactly the same way, so we use the same fitted scaler to transform the test data.
Standardize features by removing the mean and scaling to unit variance
The standard score of a sample x is calculated as:
z = (x - u) / s
where u is the mean of the training samples or zero if with_mean=False, and s is the standard deviation of the training samples or one if with_std=False.
# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
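The formula z = (x - u) / s can be checked by hand on a few numbers. The sketch below uses made-up ages and the standard library only. It assumes the population standard deviation, which is what StandardScaler computes, and it reuses the training u and s for the test values.

```python
from statistics import mean, pstdev

# Toy ages; the numbers are illustrative only.
train_ages = [20.0, 30.0, 40.0, 50.0]
test_ages = [35.0, 60.0]

# "Fit": learn u and s from the training data only.
u = mean(train_ages)
s = pstdev(train_ages)  # population standard deviation

def standardize(values):
    """Apply z = (x - u) / s with the training statistics."""
    return [(x - u) / s for x in values]

train_scaled = standardize(train_ages)
test_scaled = standardize(test_ages)  # reuses the training u and s
print(train_scaled)
print(test_scaled)
```

The scaled training values have mean 0 and unit variance by construction; the test values generally do not, because they are scaled with the training statistics.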
Now, the preprocessing on our data is done. We will start building our neural network model. The library we use to build our NN model is Keras. Keras is a high-level neural networks API, written in Python and capable of running on top of TensorFlow, CNTK, or Theano. It was developed with a focus on enabling fast experimentation.
We need the Sequential model to initialize the neural network and the Dense layer to add hidden layers.
# Importing the Keras libraries and packages
import keras
from keras.models import Sequential
from keras.layers import Dense
#Initializing Neural Network
classifier = Sequential()
Now we add layers to the neural network. Choosing the activation function is a critical task. Here we use the rectifier (ReLU) function in the hidden layers and the sigmoid function in the output layer, since we want a binary result from the output layer. If the output has more than two categories, use the softmax function instead.
# Adding the input layer and the first hidden layer
classifier.add(Dense(activation="relu", input_dim=11, units=6, kernel_initializer="uniform"))
# Adding the second hidden layer
classifier.add(Dense(activation="relu", units=6, kernel_initializer="uniform"))
# Adding the output layer
classifier.add(Dense(activation="sigmoid", units=1, kernel_initializer="uniform"))
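The three activation functions mentioned above are simple enough to write out by hand. This is a plain Python sketch for intuition only, not the Keras implementation.

```python
import math

def relu(x):
    """Rectifier: passes positive values through and clips negatives to 0."""
    return max(0.0, x)

def sigmoid(x):
    """Squashes any real number into (0, 1), usable as a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def softmax(xs):
    """Turns a list of scores into probabilities that sum to 1."""
    shift = max(xs)  # subtracting the maximum keeps exp() from overflowing
    exps = [math.exp(x - shift) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

print(relu(-2.0), relu(3.0))
print(sigmoid(0.0))
print(softmax([2.0, 1.0, 0.1]))
```

A single sigmoid output is enough for the churn problem because the probability of class 0 is simply one minus the probability of class 1.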
So far we have added multiple layers to our classifier. Now let's compile the model, which is done with the compile method. The arguments passed here (optimizer, loss and metrics) control how the whole network is trained, so be careful at this step.
# Compiling Neural Network
classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
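To give a feel for what the binary_crossentropy loss measures, here is a hand-written sketch. The clipping value eps is an assumption made for this example, and the function is an illustration rather than the Keras code.

```python
import math

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy; eps (assumed value) avoids log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# Same labels, once with confident correct and once with confident wrong predictions.
confident_right = binary_crossentropy([1, 0], [0.9, 0.1])
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])
print(confident_right, confident_wrong)
```

The loss is small when the predicted probabilities agree with the labels and grows quickly when the model is confidently wrong, which is the signal the optimizer uses to adjust the weights.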
We now train the model on the training data using the fit method. Earlier I said that the weights are optimized to improve the model, so when are they updated? The batch size specifies the number of observations after which the weights are updated. An epoch is one full pass over the training set, so epochs sets how many times the whole training set is used.
# Fitting our model
classifier.fit(X_train, y_train, batch_size = 10, epochs = 100)
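The batch size and the number of epochs together determine how many weight updates happen. The arithmetic below uses the figures from this notebook (10000 rows, a 20% test split, batch_size = 10, epochs = 100) and assumes one update per batch.

```python
import math

n_rows = 10000      # dataset size stated at the top of the notebook
test_size = 0.2     # fraction held out by train_test_split
batch_size = 10
epochs = 100

n_train = n_rows - int(round(n_rows * test_size))
updates_per_epoch = math.ceil(n_train / batch_size)  # one update per batch
total_updates = updates_per_epoch * epochs
print(n_train, updates_per_epoch, total_updates)
```

A larger batch size means fewer updates per epoch, each based on more observations.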
Predicting on the test data
We now predict the test set results. The prediction gives the probability that a customer leaves the bank. We convert that probability into a binary 0 or 1 using a threshold of 0.5.
# Predicting the Test set results
y_pred = classifier.predict(X_test)
y_pred = (y_pred > 0.5)
Confusion matrix of test set and prediction accuracy
This is the final step, where we evaluate the performance of the model. We already have the true labels for the test set, so we can build a confusion matrix to check the accuracy of the model.
# Creating the Confusion Matrix
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
cm
test_accuracy = np.sum([cm[0,0], cm[1,1]])/np.sum(cm)
test_accuracy
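The thresholding and the accuracy calculation can be followed on a handful of made-up predictions. The sketch below builds the 2x2 matrix by hand, using the same layout as confusion_matrix (rows are true classes, columns are predicted classes). The labels and probabilities are illustrative only.

```python
# Toy labels and predicted probabilities; the values are illustrative only.
y_true = [0, 0, 1, 1, 0, 1]
y_prob = [0.1, 0.7, 0.8, 0.4, 0.2, 0.9]

# Same 0.5 cut-off as in the notebook.
y_hat = [1 if p > 0.5 else 0 for p in y_prob]

# cm_toy[i][j] counts samples of true class i predicted as class j.
cm_toy = [[0, 0], [0, 0]]
for t, p in zip(y_true, y_hat):
    cm_toy[t][p] += 1

# Accuracy is the diagonal (correct predictions) divided by all predictions.
accuracy = (cm_toy[0][0] + cm_toy[1][1]) / len(y_true)
print(cm_toy)
print(accuracy)
```

The off-diagonal cells separate the two kinds of error: customers wrongly flagged as leaving, and leavers the model missed.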
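In this run the model achieved 85.15% accuracy on the test set. The exact figure varies between runs, because neither the train/test split nor the weight initialization is seeded. Accuracy should also be read against the share of customers who did not leave, since a model that always predicts 0 already scores that share.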
Adjusting the neural network
Let's change the structure of the network to see what happens. We add another hidden layer and widen the first hidden layer from 6 to 8 units.
#Initializing Neural Network
classifier2 = Sequential()
# Adding the input layer and the first hidden layer
classifier2.add(Dense(activation="relu", input_dim=11, units=8, kernel_initializer="uniform"))
# Adding the second hidden layer
classifier2.add(Dense(activation="relu", units=6, kernel_initializer="uniform"))
# Adding the third hidden layer
classifier2.add(Dense(activation="relu", units=6, kernel_initializer="uniform"))
# Adding the output layer
classifier2.add(Dense(activation="sigmoid", units=1, kernel_initializer="uniform"))
# Compiling Neural Network
classifier2.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
# Fitting our model
classifier2.fit(X_train, y_train, batch_size = 10, epochs = 100)
# Predicting the Test set results
y_pred2 = classifier2.predict(X_test)
y_pred2 = (y_pred2 > 0.5)
cm2 = confusion_matrix(y_test, y_pred2)
test_accuracy2 = np.sum([cm2[0,0], cm2[1,1]])/np.sum(cm2)
test_accuracy2
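In this run the prediction accuracy on the test set increased slightly, from 85.15% to 85.55%. A difference this small may be within run-to-run variation, so it is worth repeating the experiment before drawing conclusions.
Adjusting the architecture of the neural network may achieve better results.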
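One way to compare the two architectures is to count their trainable parameters. The sketch below assumes fully connected layers with one bias per unit, which is the default for Dense. The helper dense_param_count is hypothetical and written for this example only.

```python
def dense_param_count(layer_sizes):
    """Hypothetical helper: weights plus biases for a stack of dense layers."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# Layer sizes read off the two models above: inputs, hidden layers, output.
first_net = dense_param_count([11, 6, 6, 1])
second_net = dense_param_count([11, 8, 6, 6, 1])
print(first_net, second_net)
```

The second network has more parameters, yet the accuracy gain reported above is small, which suggests that model capacity is not the main limit here.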