Making a Simple Deep Learning Model with TensorFlow
This article is inspired by the video below, but I will go into more depth about the math and the lines that are glossed over. The ipynb and csv input can be found here: https://github.com/nicknochnack/Tensorflow-in-10-Minutes.
The Data
We are given a csv of churn data. In business, churn means that a customer has stopped using a company’s product or service.
import pandas as pd
from sklearn.model_selection import train_test_split
df = pd.read_csv('Churn.csv')
df.head()
These first few lines import some of the libraries we need, read in the churn data and display its first five rows.
We can see we have a lot of input columns which may play a part in whether the user leaves our service, such as Streaming TV, Internet Service, Contract, etc.
We also have our target column which we want to predict: Churn
X = pd.get_dummies(df.drop(['Churn', 'Customer ID'], axis=1))
y = df['Churn'].apply(lambda x: 1 if x=='Yes' else 0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.2)
y_train.head()
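We drop the Churn column from the inputs (because it is the output variable) and Customer ID (since it is a unique identifier and has no relation to the user’s likelihood to churn). pd.get_dummies then one-hot encodes the remaining categorical columns so that every input is numeric.
We create a binary vector y, with 1 if the customer churned and 0 if they did not.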
We split the data into a train set (with 80% of the data) and a test set (with 20% of the data).
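To make the encoding step concrete, here is a minimal pure-Python sketch of what one-hot encoding and the Yes/No mapping do. The toy rows, the category values and the helper `one_hot` are made up for illustration; in the article itself pd.get_dummies does this work on the real data.

```python
# Hypothetical toy rows, not taken from Churn.csv.
rows = [
    {"Contract": "Monthly", "Churn": "Yes"},
    {"Contract": "Yearly", "Churn": "No"},
    {"Contract": "Monthly", "Churn": "No"},
]

def one_hot(rows, column):
    """Turn one categorical column into one 0/1 column per category."""
    categories = sorted({row[column] for row in rows})
    return [
        {f"{column}_{cat}": int(row[column] == cat) for cat in categories}
        for row in rows
    ]

X_toy = one_hot(rows, "Contract")
# Same mapping as the lambda used for y: 1 for 'Yes', 0 otherwise.
y_toy = [1 if row["Churn"] == "Yes" else 0 for row in rows]

print(X_toy[0])
print(y_toy)
```

Each category becomes its own column, which is why the number of input columns after encoding is larger than the number of columns in the raw csv.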
The Model
from tensorflow.keras.models import Sequential, load_model
from tensorflow.keras.layers import Dense
from sklearn.metrics import accuracy_score
We need to import some packages in order to build the model.
model = Sequential()
model.add(Dense(units=32, activation='relu', input_dim=len(X_train.columns)))
model.add(Dense(units=64, activation='relu'))
model.add(Dense(units=1, activation='sigmoid'))
Architecture
A sequential model is a plain stack of layers in which the output of layer i becomes the input of layer i+1.
We create a first layer with 32 output nodes and 22 input nodes (one per column of X_train). A dense layer means that every input node has a connection to every output node, and each connection carries one weight. So the first layer has 22 × 32 = 704 weights, the second has 32 × 64 = 2048 and the final layer has 64 × 1 = 64. Each output node also has a bias, which this count leaves out.
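The connection counts can be checked with a few lines of plain Python. This is only a sketch of the arithmetic, assuming the 22 input columns stated in the text; the helper `dense_param_counts` is a name made up here, and it also counts the one bias per output node that a Keras Dense layer has by default.

```python
# Layer sizes as described in the article: 22 inputs, then 32, 64 and 1 units.
layer_sizes = [22, 32, 64, 1]

def dense_param_counts(sizes):
    """Return (weights, biases) for each consecutive pair of layer sizes."""
    return [(n_in * n_out, n_out) for n_in, n_out in zip(sizes, sizes[1:])]

counts = dense_param_counts(layer_sizes)
total_weights = sum(w for w, _ in counts)
total_params = sum(w + b for w, b in counts)

for (w, b), units in zip(counts, layer_sizes[1:]):
    print(f"Dense({units}): {w} weights + {b} biases")
print("total trainable parameters:", total_params)
```

If the input really has 22 columns, the total printed here should match what model.summary() reports.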
Activation Functions
Our activation function determines what the value of the output node is. The relu function is defined as relu(x) = max(0, x).
It only activates when its input is positive; otherwise it outputs 0. Each node first adds up its inputs (in a real layer each input is multiplied by a weight and a bias is added, which we ignore here for simplicity). So if the inputs to a node were -10, 5 and -15, then relu(-10 + 5 + -15) = relu(-20) = 0. If the sum were positive, we would just get that number back. In general, relu(n) = 0 for n ≤ 0 and relu(n) = n for n > 0.
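As a rough sketch, a single relu node can be written in plain Python like this. The function names and the second set of weights are made up for illustration; Keras does the same computation for all nodes at once with matrix operations.

```python
def relu(n):
    """Rectified linear unit: negative values become 0, others pass through."""
    return max(0.0, n)

def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through relu."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return relu(total)

inputs = [-10.0, 5.0, -15.0]
# Weights of 1 and a bias of 0 reproduce the plain sum used in the text.
print(neuron_output(inputs, [1.0, 1.0, 1.0], 0.0))   # 0.0
# Hypothetical weights that make the weighted sum positive.
print(neuron_output(inputs, [-0.5, 1.0, 0.0], 0.0))  # 10.0
```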
The sigmoid function is used on the output layer since we are building a binary classifier and want a number between 0 and 1. It is defined as sigmoid(x) = 1 / (1 + e^(-x)), which squashes any real input into that range, so the output can be read as the predicted probability of churn.
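A small plain-Python version of the sigmoid shows how it squashes values. The two-branch form below is just one common way to avoid overflow for very negative inputs, not necessarily how Keras implements it.

```python
import math

def sigmoid(x):
    """Logistic sigmoid, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

for x in (-5.0, 0.0, 5.0):
    print(x, round(sigmoid(x), 4))
```

Large negative inputs land close to 0, large positive inputs close to 1, and an input of 0 gives exactly 0.5.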
The Loss function
model.compile(loss='binary_crossentropy', optimizer='sgd', metrics=['accuracy'])
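We then define our loss function, which is the quantity the model tries to minimize during training. A lower loss usually goes along with a higher accuracy, although the two are not the same thing: the loss also measures how confident the predictions are.
Binary cross entropy is a logarithmic loss used in binary classification models. It penalizes confident wrong predictions heavily. For a single-layer model such as logistic regression this loss is convex, so any minimum found is the global one; for a network with hidden layers like ours it is not convex in the weights, so training settles for a good local minimum.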
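To see what the loss measures, here is a minimal plain-Python sketch of binary cross entropy averaged over a batch. The clipping constant `eps` and the example probabilities are assumptions chosen for illustration.

```python
import math

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross entropy over a batch of labels and probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip so log(0) never occurs
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

labels = [1, 0]
print(binary_cross_entropy(labels, [0.9, 0.1]))  # confident and right: small loss
print(binary_cross_entropy(labels, [0.1, 0.9]))  # confident and wrong: large loss
```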
SGD, or Stochastic Gradient Descent, takes small steps in the direction of steepest descent of the loss, with that direction estimated from one batch of data at a time, in order to move towards a minimum of the loss and therefore, usually, a better accuracy.
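The idea can be sketched on a much smaller model than ours. The example below trains a one-weight logistic regression (not the three-layer network from the article) with mini-batch SGD in plain Python; the toy data, learning rate and batch size are all made up for illustration.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def mean_loss(w, b, data):
    """Mean binary cross entropy of the model sigmoid(w*x + b) on data."""
    total = 0.0
    for x, y in data:
        p = sigmoid(w * x + b)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(data)

# Hypothetical toy data: the label is 1 when x is positive.
data = [(-2.0, 0), (-1.0, 0), (-0.5, 0), (0.5, 1), (1.0, 1), (2.0, 1)]

rng = random.Random(0)
w, b, learning_rate, batch_size = 0.0, 0.0, 0.1, 2
loss_before = mean_loss(w, b, data)

for epoch in range(50):
    shuffled = data[:]
    rng.shuffle(shuffled)
    for start in range(0, len(shuffled), batch_size):
        batch = shuffled[start:start + batch_size]
        # Gradient of the mean batch loss with respect to w and b.
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in batch) / len(batch)
        grad_b = sum((sigmoid(w * x + b) - y) for x, y in batch) / len(batch)
        # Step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b

loss_after = mean_loss(w, b, data)
print(loss_before, loss_after)
```

Each update only looks at one small batch, so individual steps are noisy, but over many steps the loss on the whole dataset goes down.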
The Training
model.fit(X_train, y_train, epochs=200, batch_size=32)
We then fit the model on the training set. We train in batches of 32, meaning the model evaluates 32 rows, computes the gradient of the loss with backpropagation, and only then updates its parameters. We repeat this until we have gone through all our training data, which completes one epoch.
We do this for 200 epochs, so in the end the model has seen each training row 200 times.
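The number of parameter updates follows directly from the batch size and the number of epochs. The sketch below works it out for a made-up training set of 1000 rows; the real figure depends on len(X_train), and `gradient_updates` is a helper name invented here.

```python
import math

def gradient_updates(n_samples, batch_size, epochs):
    """Number of parameter updates made by mini-batch training."""
    steps_per_epoch = math.ceil(n_samples / batch_size)  # last batch may be smaller
    return steps_per_epoch, steps_per_epoch * epochs

# 1000 training rows is a made-up figure; the real count is len(X_train).
steps, total = gradient_updates(n_samples=1000, batch_size=32, epochs=200)
print(steps, total)  # 32 6400
```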
Evaluate
y_hat = model.predict(X_test)
y_hat = [0 if val < 0.5 else 1 for val in y_hat]
accuracy_score(y_test, y_hat)
Next, we predict churn probabilities for X_test, turn each one into a class by thresholding at 0.5 (values below 0.5 become 0, everything else becomes 1), and compare with the actual labels to get our accuracy score.
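The thresholding and the accuracy calculation are simple enough to write out by hand. This is a plain-Python sketch with made-up probabilities and labels; `threshold` and `accuracy` are illustrative helpers, and sklearn's accuracy_score computes the same fraction.

```python
def threshold(probabilities, cutoff=0.5):
    """Map each probability to class 1 if it is at least cutoff, else 0."""
    return [0 if p < cutoff else 1 for p in probabilities]

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)

# Hypothetical model outputs and true labels.
probs = [0.2, 0.7, 0.5, 0.4]
truth = [0, 1, 0, 0]

preds = threshold(probs)
print(preds)                   # [0, 1, 1, 0]
print(accuracy(truth, preds))  # 0.75
```

Note that a probability of exactly 0.5 is assigned to class 1, matching the list comprehension used above.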
Save and Load
model.save('tfmodel.keras')
del model
model = load_model('tfmodel.keras')
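Finally, we can save the trained model to disk, delete it from memory and load it back, which restores the same architecture and weights without retraining.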